Overview

Dataset statistics

Number of variables: 36
Number of observations: 122694
Missing cells: 989398
Missing cells (%): 22.4%
Total size in memory: 30.4 MiB
Average record size in memory: 260.0 B
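
The missing-cell percentage follows from the three counts above. A minimal check, assuming the percentage is taken over all cells (variables × observations):

```python
# Figures copied from the dataset statistics above.
n_variables = 36
n_observations = 122694
missing_cells = 989398

# Assumption: the percentage is relative to every cell in the table.
total_cells = n_variables * n_observations
missing_pct = 100 * missing_cells / total_cells

print(f"total cells: {total_cells}")
print(f"missing cells: {missing_pct:.1f}%")
```

The result agrees with the 22.4% reported above.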

Variable types

Text: 22
Numeric: 8
Boolean: 4
Unsupported: 2

Alerts

site_id has constant value "" [Constant]
promotions has constant value "" [Constant]
shipping_store_pick_up has constant value "" [Constant]
installments_rate has constant value "" [Constant]
installments_currency_id has constant value "" [Constant]
accepts_mercadopago is highly imbalanced (57.8%) [Imbalance]
condition has 2512 (2.0%) missing values [Missing]
catalog_product_id has 72249 (58.9%) missing values [Missing]
price has 1233 (1.0%) missing values [Missing]
original_price has 94174 (76.8%) missing values [Missing]
official_store_id has 107396 (87.5%) missing values [Missing]
official_store_name has 107442 (87.6%) missing values [Missing]
shipping_logistic_type has 10083 (8.2%) missing values [Missing]
shipping_benefits has 122694 (100.0%) missing values [Missing]
shipping_promise has 122694 (100.0%) missing values [Missing]
installments_quantity has 26197 (21.4%) missing values [Missing]
installments_amount has 26197 (21.4%) missing values [Missing]
installments_rate has 26197 (21.4%) missing values [Missing]
installments_currency_id has 26197 (21.4%) missing values [Missing]
brand_value_name has 19767 (16.1%) missing values [Missing]
location has 112183 (91.4%) missing values [Missing]
seller_contact has 112183 (91.4%) missing values [Missing]
price is highly skewed (γ1 = 21.66230264) [Skewed]
original_price is highly skewed (γ1 = 36.80616502) [Skewed]
available_quantity is highly skewed (γ1 = 34.46965695) [Skewed]
installments_amount is highly skewed (γ1 = 26.51384463) [Skewed]
shipping_benefits is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
shipping_promise is an unsupported type, check if it needs cleaning or further analysis [Unsupported]
installments_rate has 96497 (78.6%) zeros [Zeros]
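
A column can carry several alerts at once: installments_rate above is flagged as Constant, Missing and Zeros, and its 96497 zeros plus 26197 missing values account for all 122694 rows. The sketch below shows how such flags could be derived for one column. The function name and both thresholds are hypothetical and are not the profiler's actual rules.

```python
def column_alerts(values, missing_threshold=0.0, zeros_threshold=0.5):
    """Return alert labels for one column; None marks a missing cell.

    Both thresholds are illustrative assumptions, not the profiler's defaults.
    """
    n = len(values)
    present = [v for v in values if v is not None]
    alerts = []
    # Constant: every non-missing cell holds the same value.
    if present and len(set(present)) == 1:
        alerts.append("Constant")
    n_missing = n - len(present)
    if n and n_missing / n > missing_threshold:
        alerts.append(f"Missing ({n_missing}, {100 * n_missing / n:.1f}%)")
    n_zeros = sum(1 for v in present if v == 0)
    if n and n_zeros / n > zeros_threshold:
        alerts.append(f"Zeros ({n_zeros}, {100 * n_zeros / n:.1f}%)")
    return alerts


# Toy column shaped like installments_rate: only zeros and missing cells.
toy_rate = [0, 0, 0, None, 0]
print(column_alerts(toy_rate))
```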

Reproduction

Analysis started: 2024-02-25 01:31:00.259338
Analysis finished: 2024-02-25 01:31:10.090439
Duration: 9.83 seconds
Software version: ydata-profiling v4.6.4
Download configuration: config.json

Variables

id
Text

Distinct: 115871
Distinct (%): 94.4%
Missing: 0
Missing (%): 0.0%
Memory size: 958.7 KiB

Length

Max length: 13
Median length: 13
Mean length: 12.64947756
Min length: 12

Characters and Unicode

Total characters: 1552015
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.
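
As an illustration of those character properties, the sketch below looks up the general category of each character in the first sample id using Python's standard `unicodedata` module. It covers categories only; the standard library has no script or block lookup, so those two counts are not reproduced here.

```python
import unicodedata
from collections import Counter

sample_id = "MCO1324022622"  # first sample row of `id`

# General category per character: "Lu" = uppercase letter, "Nd" = decimal number.
categories = Counter(unicodedata.category(ch) for ch in sample_id)

print(len(sample_id), "characters")
print(dict(categories))
```

The two categories found match the "Distinct categories" count of 2 reported for this variable.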

Unique

Unique: 109054
Unique (%): 88.9%
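
"Distinct" and "Unique" measure different things. The sketch below assumes the usual reading: distinct counts the different values present, and unique counts the values that occur exactly once. The toy ids are made up for illustration.

```python
from collections import Counter

# Hypothetical ids: MCO2 appears twice, MCO3 three times.
toy_ids = ["MCO1", "MCO2", "MCO2", "MCO3", "MCO3", "MCO3", "MCO4"]
counts = Counter(toy_ids)

n_distinct = len(counts)                              # different values present
n_unique = sum(1 for c in counts.values() if c == 1)  # values seen exactly once

print(n_distinct, n_unique)

# The reported percentages are both relative to the 122694 observations.
distinct_pct = 100 * 115871 / 122694
unique_pct = 100 * 109054 / 122694
print(f"{distinct_pct:.1f}% distinct, {unique_pct:.1f}% unique")
```

Under that reading, about 5.6% of the rows in `id` repeat an identifier that already appears elsewhere, which is worth checking before treating `id` as a key.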

Sample

1st row: MCO1324022622
2nd row: MCO1271613775
3rd row: MCO1037658202
4th row: MCO1343307899
5th row: MCO1318891937

Value                  Count   Frequency (%)
mco1321892971          3       < 0.1%
mco1326134609          3       < 0.1%
mco1620188094          3       < 0.1%
mco1149899347          3       < 0.1%
mco1092024925          3       < 0.1%
mco1303888682          3       < 0.1%
mco1339099263          2       < 0.1%
mco585366257           2       < 0.1%
mco2012965406          2       < 0.1%
mco1213397504          2       < 0.1%
Other values (115861)  122668  > 99.9%

Most occurring characters

ValueCountFrequency (%)
1 171665
11.1%
2 126272
8.1%
3 126070
8.1%
M 122694
 
7.9%
C 122694
 
7.9%
O 122694
 
7.9%
8 114092
 
7.4%
5 113388
 
7.3%
9 111385
 
7.2%
6 109386
 
7.0%
Other values (3) 311675
20.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 1183933
76.3%
Uppercase Letter 368082
 
23.7%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 171665
14.5%
2 126272
10.7%
3 126070
10.6%
8 114092
9.6%
5 113388
9.6%
9 111385
9.4%
6 109386
9.2%
0 107719
9.1%
4 102841
8.7%
7 101115
8.5%
Uppercase Letter
ValueCountFrequency (%)
M 122694
33.3%
C 122694
33.3%
O 122694
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 1183933
76.3%
Latin 368082
 
23.7%

Most frequent character per script

Common
ValueCountFrequency (%)
1 171665
14.5%
2 126272
10.7%
3 126070
10.6%
8 114092
9.6%
5 113388
9.6%
9 111385
9.4%
6 109386
9.2%
0 107719
9.1%
4 102841
8.7%
7 101115
8.5%
Latin
ValueCountFrequency (%)
M 122694
33.3%
C 122694
33.3%
O 122694
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1552015
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 171665
11.1%
2 126272
8.1%
3 126070
8.1%
M 122694
 
7.9%
C 122694
 
7.9%
O 122694
 
7.9%
8 114092
 
7.4%
5 113388
 
7.3%
9 111385
 
7.2%
6 109386
 
7.0%
Other values (3) 311675
20.1%

title
Text

Distinct: 110244
Distinct (%): 89.9%
Missing: 0
Missing (%): 0.0%
Memory size: 958.7 KiB

Length

Max length: 200
Median length: 190
Mean length: 52.32759548
Min length: 1

Characters and Unicode

Total characters: 6420282
Distinct characters: 148
Distinct categories: 17
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 99738
Unique (%): 81.3%

Sample

1st row: Pila Recargables Aa X2 Energizer 2000 Mah
2nd row: Pila Energizer Recharge Universal Aaa X 4und
3rd row: Audífonos In-ear Inalámbricos Bluetooth F9-5 Negro
4th row: Televisor 43 Pulgadas Smart Android Ref. 43lo69
5th row: Parlante Jbl Flip 6 Portátil Con Bluetooth Waterproof Roja 110v/220v

Value                 Count   Frequency (%)
de                    43603   4.2%
                      28884   2.8%
para                  18987   1.8%
en                    10973   1.1%
color                 10959   1.1%
a                     10517   1.0%
con                   8778    0.8%
y                     8540    0.8%
x                     7738    0.7%
negro                 6766    0.7%
Other values (75968)  878283  84.9%
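
The word counts above are consistent with lowercasing each title and splitting it on whitespace. The sketch below applies that assumed tokenisation to the five sample titles; the profiler's exact rules (for example, how punctuation is handled) may differ.

```python
from collections import Counter

sample_titles = [
    "Pila Recargables Aa X2 Energizer 2000 Mah",
    "Pila Energizer Recharge Universal Aaa X 4und",
    "Audífonos In-ear Inalámbricos Bluetooth F9-5 Negro",
    "Televisor 43 Pulgadas Smart Android Ref. 43lo69",
    "Parlante Jbl Flip 6 Portátil Con Bluetooth Waterproof Roja 110v/220v",
]

# Assumed tokenisation: lowercase, then split on runs of whitespace.
word_counts = Counter(
    word for title in sample_titles for word in title.lower().split()
)

print(word_counts.most_common(3))
```

On the full column, stop words such as "de" and "para" dominate, so a stop-word filter would be a natural next step before using these counts as features.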

Most occurring characters

ValueCountFrequency (%)
937583
 
14.6%
a 564580
 
8.8%
o 449897
 
7.0%
e 433113
 
6.7%
r 363625
 
5.7%
i 321974
 
5.0%
l 268125
 
4.2%
n 254151
 
4.0%
t 236136
 
3.7%
s 217370
 
3.4%
Other values (138) 2373728
37.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 4150324
64.6%
Space Separator 938276
 
14.6%
Uppercase Letter 883908
 
13.8%
Decimal Number 338293
 
5.3%
Other Punctuation 53056
 
0.8%
Dash Punctuation 30811
 
0.5%
Math Symbol 9650
 
0.2%
Currency Symbol 7384
 
0.1%
Open Punctuation 3779
 
0.1%
Close Punctuation 3531
 
0.1%
Other values (7) 1270
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
a 564580
13.6%
o 449897
10.8%
e 433113
10.4%
r 363625
8.8%
i 321974
 
7.8%
l 268125
 
6.5%
n 254151
 
6.1%
t 236136
 
5.7%
s 217370
 
5.2%
c 151742
 
3.7%
Other values (41) 889611
21.4%
Uppercase Letter
ValueCountFrequency (%)
C 103820
 
11.7%
P 91103
 
10.3%
D 76773
 
8.7%
M 60003
 
6.8%
A 60001
 
6.8%
S 53267
 
6.0%
E 50105
 
5.7%
B 44933
 
5.1%
T 40092
 
4.5%
L 39483
 
4.5%
Other values (27) 264328
29.9%
Other Punctuation
ValueCountFrequency (%)
, 18834
35.5%
. 16096
30.3%
/ 10380
19.6%
: 1674
 
3.2%
' 1400
 
2.6%
% 1153
 
2.2%
! 1036
 
2.0%
& 853
 
1.6%
* 623
 
1.2%
# 487
 
0.9%
Other values (7) 520
 
1.0%
Decimal Number
ValueCountFrequency (%)
0 82795
24.5%
1 60365
17.8%
2 50322
14.9%
5 32435
 
9.6%
3 26951
 
8.0%
4 23971
 
7.1%
6 19229
 
5.7%
8 15949
 
4.7%
7 13355
 
3.9%
9 12921
 
3.8%
Math Symbol
ValueCountFrequency (%)
+ 9034
93.6%
| 517
 
5.4%
× 39
 
0.4%
= 35
 
0.4%
~ 16
 
0.2%
± 8
 
0.1%
÷ 1
 
< 0.1%
Other Number
ValueCountFrequency (%)
³ 31
36.9%
½ 22
26.2%
² 20
23.8%
¼ 11
 
13.1%
Open Punctuation
ValueCountFrequency (%)
( 3532
93.5%
[ 243
 
6.4%
{ 4
 
0.1%
Close Punctuation
ValueCountFrequency (%)
) 3305
93.6%
] 221
 
6.3%
} 5
 
0.1%
Other Symbol
ValueCountFrequency (%)
® 448
58.8%
° 306
40.2%
© 8
 
1.0%
Modifier Symbol
ValueCountFrequency (%)
´ 219
93.2%
¨ 9
 
3.8%
` 7
 
3.0%
Space Separator
ValueCountFrequency (%)
937583
99.9%
  693
 
0.1%
Currency Symbol
ValueCountFrequency (%)
$ 7383
> 99.9%
£ 1
 
< 0.1%
Other Letter
ValueCountFrequency (%)
ª 75
55.1%
º 61
44.9%
Dash Punctuation
ValueCountFrequency (%)
- 30811
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 45
100.0%
Final Punctuation
ValueCountFrequency (%)
» 5
100.0%
Format
ValueCountFrequency (%)
­ 3
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 5034367
78.4%
Common 1385915
 
21.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
a 564580
 
11.2%
o 449897
 
8.9%
e 433113
 
8.6%
r 363625
 
7.2%
i 321974
 
6.4%
l 268125
 
5.3%
n 254151
 
5.0%
t 236136
 
4.7%
s 217370
 
4.3%
c 151742
 
3.0%
Other values (79) 1773654
35.2%
Common
ValueCountFrequency (%)
937583
67.7%
0 82795
 
6.0%
1 60365
 
4.4%
2 50322
 
3.6%
5 32435
 
2.3%
- 30811
 
2.2%
3 26951
 
1.9%
4 23971
 
1.7%
6 19229
 
1.4%
, 18834
 
1.4%
Other values (49) 102619
 
7.4%

Most occurring blocks

ValueCountFrequency (%)
ASCII 6360220
99.1%
None 60062
 
0.9%

Most frequent character per block

ASCII
ValueCountFrequency (%)
937583
14.7%
a 564580
 
8.9%
o 449897
 
7.1%
e 433113
 
6.8%
r 363625
 
5.7%
i 321974
 
5.1%
l 268125
 
4.2%
n 254151
 
4.0%
t 236136
 
3.7%
s 217370
 
3.4%
Other values (81) 2313666
36.4%
None
ValueCountFrequency (%)
ó 13904
23.1%
á 12892
21.5%
ñ 11288
18.8%
í 9744
16.2%
é 6775
11.3%
ú 1627
 
2.7%
  693
 
1.2%
Á 639
 
1.1%
® 448
 
0.7%
° 306
 
0.5%
Other values (47) 1746
 
2.9%

condition
Text

MISSING 

Distinct: 3
Distinct (%): < 0.1%
Missing: 2512
Missing (%): 2.0%
Memory size: 958.7 KiB

Length

Max length: 13
Median length: 3
Mean length: 3.08547037
Min length: 3

Characters and Unicode

Total characters: 370818
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: new
2nd row: new
3rd row: new
4th row: new
5th row: new

Value          Count   Frequency (%)
new            110396  91.9%
used           9732    8.1%
not_specified  54      < 0.1%
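
The frequencies for `condition` appear to be computed over non-missing rows only, not over all 122694 observations. A short check under that assumption:

```python
# Figures copied from the `condition` section above.
n_observations = 122694
n_missing = 2512
value_counts = {"new": 110396, "used": 9732, "not_specified": 54}

n_present = n_observations - n_missing

# Assumption: each share is relative to the non-missing rows.
shares = {value: 100 * count / n_present for value, count in value_counts.items()}

for value, share in shares.items():
    print(f"{value}: {share:.1f}%")
```

The three counts add up exactly to the number of non-missing rows, and the shares for new and used round to the 91.9% and 8.1% shown above.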

Most occurring characters

ValueCountFrequency (%)
e 120236
32.4%
n 110450
29.8%
w 110396
29.8%
s 9786
 
2.6%
d 9786
 
2.6%
u 9732
 
2.6%
i 108
 
< 0.1%
o 54
 
< 0.1%
t 54
 
< 0.1%
_ 54
 
< 0.1%
Other values (3) 162
 
< 0.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 370764
> 99.9%
Connector Punctuation 54
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 120236
32.4%
n 110450
29.8%
w 110396
29.8%
s 9786
 
2.6%
d 9786
 
2.6%
u 9732
 
2.6%
i 108
 
< 0.1%
o 54
 
< 0.1%
t 54
 
< 0.1%
p 54
 
< 0.1%
Other values (2) 108
 
< 0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 54
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 370764
> 99.9%
Common 54
 
< 0.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 120236
32.4%
n 110450
29.8%
w 110396
29.8%
s 9786
 
2.6%
d 9786
 
2.6%
u 9732
 
2.6%
i 108
 
< 0.1%
o 54
 
< 0.1%
t 54
 
< 0.1%
p 54
 
< 0.1%
Other values (2) 108
 
< 0.1%
Common
ValueCountFrequency (%)
_ 54
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 370818
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 120236
32.4%
n 110450
29.8%
w 110396
29.8%
s 9786
 
2.6%
d 9786
 
2.6%
u 9732
 
2.6%
i 108
 
< 0.1%
o 54
 
< 0.1%
t 54
 
< 0.1%
_ 54
 
< 0.1%
Other values (3) 162
 
< 0.1%

catalog_product_id
Text

MISSING 

Distinct: 41327
Distinct (%): 81.9%
Missing: 72249
Missing (%): 58.9%
Memory size: 958.7 KiB

Length

Max length: 13
Median length: 11
Mean length: 10.96986817
Min length: 9

Characters and Unicode

Total characters: 553375
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 34624
Unique (%): 68.6%

Sample

1st row: MCO21850181
2nd row: MCO22015422
3rd row: MCO16224063
4th row: MCO26796977
5th row: MCO18930465

Value                 Count  Frequency (%)
mco18951365           20     < 0.1%
mco6107994            20     < 0.1%
mco16268158           17     < 0.1%
mco19681518           17     < 0.1%
mco22238606           16     < 0.1%
mco21777210           16     < 0.1%
mco18706965           15     < 0.1%
mco8755482            15     < 0.1%
mco27216699           15     < 0.1%
mco6454377            15     < 0.1%
Other values (41317)  50279  99.7%

Most occurring characters

ValueCountFrequency (%)
2 68856
12.4%
M 50445
9.1%
C 50445
9.1%
O 50445
9.1%
1 48306
8.7%
7 37907
 
6.9%
9 36461
 
6.6%
0 36259
 
6.6%
5 36133
 
6.5%
3 35525
 
6.4%
Other values (3) 102593
18.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 402040
72.7%
Uppercase Letter 151335
 
27.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
2 68856
17.1%
1 48306
12.0%
7 37907
9.4%
9 36461
9.1%
0 36259
9.0%
5 36133
9.0%
3 35525
8.8%
4 34374
8.5%
8 34197
8.5%
6 34022
8.5%
Uppercase Letter
ValueCountFrequency (%)
M 50445
33.3%
C 50445
33.3%
O 50445
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 402040
72.7%
Latin 151335
 
27.3%

Most frequent character per script

Common
ValueCountFrequency (%)
2 68856
17.1%
1 48306
12.0%
7 37907
9.4%
9 36461
9.1%
0 36259
9.0%
5 36133
9.0%
3 35525
8.8%
4 34374
8.5%
8 34197
8.5%
6 34022
8.5%
Latin
ValueCountFrequency (%)
M 50445
33.3%
C 50445
33.3%
O 50445
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 553375
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
2 68856
12.4%
M 50445
9.1%
C 50445
9.1%
O 50445
9.1%
1 48306
8.7%
7 37907
 
6.9%
9 36461
 
6.6%
0 36259
 
6.6%
5 36133
 
6.5%
3 35525
 
6.4%
Other values (3) 102593
18.5%
Distinct: 7
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 958.7 KiB

Length

Max length: 12
Median length: 12
Mean length: 10.64477481
Min length: 4

Characters and Unicode

Total characters: 1306050
Distinct characters: 19
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: gold_special
2nd row: gold_special
3rd row: gold_special
4th row: gold_pro
5th row: gold_special

Value         Count  Frequency (%)
gold_special  74864  61.0%
gold_pro      37313  30.4%
gold_premium  8359   6.8%
free          1939   1.6%
gold          100    0.1%
bronze        82     0.1%
silver        37     < 0.1%

Most occurring characters

ValueCountFrequency (%)
l 195537
15.0%
o 158031
12.1%
g 120636
9.2%
d 120636
9.2%
_ 120536
9.2%
p 120536
9.2%
e 87220
6.7%
i 83260
6.4%
s 74901
 
5.7%
c 74864
 
5.7%
Other values (9) 149893
11.5%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1185514
90.8%
Connector Punctuation 120536
 
9.2%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
l 195537
16.5%
o 158031
13.3%
g 120636
10.2%
d 120636
10.2%
p 120536
10.2%
e 87220
7.4%
i 83260
7.0%
s 74901
 
6.3%
c 74864
 
6.3%
a 74864
 
6.3%
Other values (8) 75029
 
6.3%
Connector Punctuation
ValueCountFrequency (%)
_ 120536
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1185514
90.8%
Common 120536
 
9.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
l 195537
16.5%
o 158031
13.3%
g 120636
10.2%
d 120636
10.2%
p 120536
10.2%
e 87220
7.4%
i 83260
7.0%
s 74901
 
6.3%
c 74864
 
6.3%
a 74864
 
6.3%
Other values (8) 75029
 
6.3%
Common
ValueCountFrequency (%)
_ 120536
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1306050
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
l 195537
15.0%
o 158031
12.1%
g 120636
9.2%
d 120636
9.2%
_ 120536
9.2%
p 120536
9.2%
e 87220
6.7%
i 83260
6.4%
s 74901
 
5.7%
c 74864
 
5.7%
Other values (9) 149893
11.5%
Distinct: 115272
Distinct (%): 94.0%
Missing: 0
Missing (%): 0.0%
Memory size: 958.7 KiB

Length

Max length: 361
Median length: 320
Mean length: 103.4475606
Min length: 47

Characters and Unicode

Total characters: 12692395
Distinct characters: 45
Distinct categories: 6
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 108040
Unique (%): 88.1%

Sample

1st row: https://www.mercadolibre.com.co/pila-recargables-aa-x2-energizer-2000-mah/p/MCO21850181
2nd row: https://www.mercadolibre.com.co/pila-energizer-recharge-universal-aaa-x-4und/p/MCO22015422
3rd row: https://www.mercadolibre.com.co/audifonos-in-ear-inalambricos-bluetooth-f9-5-negro/p/MCO16224063
4th row: https://www.mercadolibre.com.co/televisor-43-pulgadas-smart-android-ref-43lo69/p/MCO26796977
5th row: https://www.mercadolibre.com.co/parlante-jbl-flip-6-portatil-con-bluetooth-waterproof-roja-110v220v/p/MCO18930465

Value  Count  Frequency (%)
https://www.mercadolibre.com.co/microfono-hyperx-quadcast-condensador-bidireccional-color-negro/p/mco15160609  5  < 0.1%
https://www.mercadolibre.com.co/boya-by-v1-negro/p/mco23076703  5  < 0.1%
https://www.mercadolibre.com.co/silla-escritorio-ejecutiva-ergonomica-reclinable-oficina-pc-color-negro-material-del-tapizado-malla/p/mco26042278  4  < 0.1%
https://www.mercadolibre.com.co/microfono-gamer-hyperx-duocast-black-hmid1r-a-bkg/p/mco19747373  4  < 0.1%
https://www.mercadolibre.com.co/audifonos-logitech-h390-diadema-color-negro/p/mco6417063  4  < 0.1%
https://www.mercadolibre.com.co/microfono-trust-primo-21674-condensador-omnidireccional-color-negro/p/mco17944890  4  < 0.1%
https://www.mercadolibre.com.co/microfono-fifine-k669-condensador-cardioide-color-negro/p/mco17469532  4  < 0.1%
https://www.mercadolibre.com.co/microfono-hyperx-blx-solocast-condensador-cardioide-color-negro/p/mco17481564  4  < 0.1%
https://www.mercadolibre.com.co/microfono-sf-666-condensador-omnidireccional-color-negro/p/mco15161567  4  < 0.1%
https://www.mercadolibre.com.co/microfono-razer-seiren-seiren-mini-condensador-supercardioide-color-blanco-mercurio/p/mco16650278  4  < 0.1%
Other values (115262)  122652  > 99.9%

Most occurring characters

ValueCountFrequency (%)
- 1137795
 
9.0%
o 934521
 
7.4%
a 844246
 
6.7%
e 745872
 
5.9%
r 730807
 
5.8%
c 706886
 
5.6%
t 602814
 
4.7%
i 554667
 
4.4%
l 507008
 
4.0%
/ 443182
 
3.5%
Other values (35) 5484597
43.2%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 8566793
67.5%
Decimal Number 1430335
 
11.3%
Dash Punctuation 1137795
 
9.0%
Other Punctuation 933958
 
7.4%
Uppercase Letter 538370
 
4.2%
Connector Punctuation 85144
 
0.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 934521
10.9%
a 844246
 
9.9%
e 745872
 
8.7%
r 730807
 
8.5%
c 706886
 
8.3%
t 602814
 
7.0%
i 554667
 
6.5%
l 507008
 
5.9%
m 415860
 
4.9%
s 399239
 
4.7%
Other values (16) 2124873
24.8%
Decimal Number
ValueCountFrequency (%)
1 202889
14.2%
2 186371
13.0%
0 179261
12.5%
5 141198
9.9%
3 134047
9.4%
6 121589
8.5%
8 121031
8.5%
4 119978
8.4%
9 116441
8.1%
7 107530
7.5%
Uppercase Letter
ValueCountFrequency (%)
M 207838
38.6%
C 122694
22.8%
O 122694
22.8%
J 85144
15.8%
Other Punctuation
ValueCountFrequency (%)
/ 443182
47.5%
. 368082
39.4%
: 122694
 
13.1%
Dash Punctuation
ValueCountFrequency (%)
- 1137795
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 85144
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 9105163
71.7%
Common 3587232
 
28.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 934521
 
10.3%
a 844246
 
9.3%
e 745872
 
8.2%
r 730807
 
8.0%
c 706886
 
7.8%
t 602814
 
6.6%
i 554667
 
6.1%
l 507008
 
5.6%
m 415860
 
4.6%
s 399239
 
4.4%
Other values (20) 2663243
29.2%
Common
ValueCountFrequency (%)
- 1137795
31.7%
/ 443182
 
12.4%
. 368082
 
10.3%
1 202889
 
5.7%
2 186371
 
5.2%
0 179261
 
5.0%
5 141198
 
3.9%
3 134047
 
3.7%
: 122694
 
3.4%
6 121589
 
3.4%
Other values (5) 550124
15.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 12692395
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
- 1137795
 
9.0%
o 934521
 
7.4%
a 844246
 
6.7%
e 745872
 
5.9%
r 730807
 
5.8%
c 706886
 
5.6%
t 602814
 
4.7%
i 554667
 
4.4%
l 507008
 
4.0%
/ 443182
 
3.5%
Other values (35) 5484597
43.2%
Distinct: 2
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 958.7 KiB

Length

Max length: 10
Median length: 10
Mean length: 10
Min length: 10

Characters and Unicode

Total characters: 1226940
Distinct characters: 16
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: buy_it_now
2nd row: buy_it_now
3rd row: buy_it_now
4th row: buy_it_now
5th row: buy_it_now

Value       Count   Frequency (%)
buy_it_now  112183  91.4%
classified  10511   8.6%

Most occurring characters

ValueCountFrequency (%)
_ 224366
18.3%
i 133205
10.9%
b 112183
9.1%
u 112183
9.1%
y 112183
9.1%
t 112183
9.1%
n 112183
9.1%
o 112183
9.1%
w 112183
9.1%
s 21022
 
1.7%
Other values (6) 63066
 
5.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1002574
81.7%
Connector Punctuation 224366
 
18.3%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
i 133205
13.3%
b 112183
11.2%
u 112183
11.2%
y 112183
11.2%
t 112183
11.2%
n 112183
11.2%
o 112183
11.2%
w 112183
11.2%
s 21022
 
2.1%
c 10511
 
1.0%
Other values (5) 52555
 
5.2%
Connector Punctuation
ValueCountFrequency (%)
_ 224366
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1002574
81.7%
Common 224366
 
18.3%

Most frequent character per script

Latin
ValueCountFrequency (%)
i 133205
13.3%
b 112183
11.2%
u 112183
11.2%
y 112183
11.2%
t 112183
11.2%
n 112183
11.2%
o 112183
11.2%
w 112183
11.2%
s 21022
 
2.1%
c 10511
 
1.0%
Other values (5) 52555
 
5.2%
Common
ValueCountFrequency (%)
_ 224366
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1226940
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
_ 224366
18.3%
i 133205
10.9%
b 112183
9.1%
u 112183
9.1%
y 112183
9.1%
t 112183
9.1%
n 112183
9.1%
o 112183
9.1%
w 112183
9.1%
s 21022
 
1.7%
Other values (6) 63066
 
5.1%

site_id
Text

CONSTANT 

Distinct: 1
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Memory size: 958.7 KiB

Length

Max length: 3
Median length: 3
Mean length: 3
Min length: 3

Characters and Unicode

Total characters: 368082
Distinct characters: 3
Distinct categories: 1
Distinct scripts: 1
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 0
Unique (%): 0.0%

Sample

1st row: MCO
2nd row: MCO
3rd row: MCO
4th row: MCO
5th row: MCO

Value  Count   Frequency (%)
mco    122694  100.0%

Most occurring characters

ValueCountFrequency (%)
M 122694
33.3%
C 122694
33.3%
O 122694
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 368082
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
M 122694
33.3%
C 122694
33.3%
O 122694
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 368082
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
M 122694
33.3%
C 122694
33.3%
O 122694
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 368082
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
M 122694
33.3%
C 122694
33.3%
O 122694
33.3%
Distinct: 3776
Distinct (%): 3.1%
Missing: 0
Missing (%): 0.0%
Memory size: 958.7 KiB

Length

Max length: 9
Median length: 9
Mean length: 8.247404111
Min length: 7

Characters and Unicode

Total characters: 1011907
Distinct characters: 13
Distinct categories: 2
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 691
Unique (%): 0.6%

Sample

1st row: MCO7279
2nd row: MCO7279
3rd row: MCO3697
4th row: MCO14903
5th row: MCO3691

Value                Count   Frequency (%)
mco1196              3998    3.3%
mco1744              3686    3.0%
mco1176              3185    2.6%
mco8830              2063    1.7%
mco1442              1784    1.5%
mco9355              1404    1.1%
mco1474              1334    1.1%
mco435877            1222    1.0%
mco167011            1135    0.9%
mco9356              1132    0.9%
Other values (3766)  101751  82.9%

Most occurring characters

ValueCountFrequency (%)
1 134793
13.3%
M 122694
12.1%
C 122694
12.1%
O 122694
12.1%
4 85845
8.5%
7 67315
6.7%
6 64412
6.4%
3 61716
6.1%
8 47186
 
4.7%
9 45895
 
4.5%
Other values (3) 136663
13.5%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 643825
63.6%
Uppercase Letter 368082
36.4%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 134793
20.9%
4 85845
13.3%
7 67315
10.5%
6 64412
10.0%
3 61716
9.6%
8 47186
 
7.3%
9 45895
 
7.1%
5 45799
 
7.1%
0 45586
 
7.1%
2 45278
 
7.0%
Uppercase Letter
ValueCountFrequency (%)
M 122694
33.3%
C 122694
33.3%
O 122694
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 643825
63.6%
Latin 368082
36.4%

Most frequent character per script

Common
ValueCountFrequency (%)
1 134793
20.9%
4 85845
13.3%
7 67315
10.5%
6 64412
10.0%
3 61716
9.6%
8 47186
 
7.3%
9 45895
 
7.1%
5 45799
 
7.1%
0 45586
 
7.1%
2 45278
 
7.0%
Latin
ValueCountFrequency (%)
M 122694
33.3%
C 122694
33.3%
O 122694
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1011907
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 134793
13.3%
M 122694
12.1%
C 122694
12.1%
O 122694
12.1%
4 85845
8.5%
7 67315
6.7%
6 64412
6.4%
3 61716
6.1%
8 47186
 
4.7%
9 45895
 
4.5%
Other values (3) 136663
13.5%
Distinct: 2746
Distinct (%): 2.2%
Missing: 0
Missing (%): 0.0%
Memory size: 958.7 KiB

Length

Max length: 58
Median length: 47
Mean length: 20.0546074
Min length: 7

Characters and Unicode

Total characters: 2460580
Distinct characters: 30
Distinct categories: 4
Distinct scripts: 2
Distinct blocks: 1
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 409
Unique (%): 0.3%

Sample

1st row: MCO-CELL_BATTERIES
2nd row: MCO-CELL_BATTERIES
3rd row: MCO-HEADPHONES
4th row: MCO-TELEVISIONS
5th row: MCO-SPEAKERS

Value                               Count  Frequency (%)
mco-books                           3998   3.3%
mco-cars_and_vans                   3686   3.0%
mco-music_albums                    3185   2.6%
mco-headphones                      2151   1.8%
mco-supplements                     2127   1.7%
mco-cats_and_dogs_foods             1887   1.5%
mco-wristwatches                    1784   1.5%
mco-coins                           1404   1.1%
mco-individual_apartments_for_sale  1334   1.1%
mco-balloons                        1222   1.0%
Other values (2736)                 99916  81.4%

Most occurring characters

ValueCountFrequency (%)
O 239471
 
9.7%
C 227623
 
9.3%
S 216410
 
8.8%
E 190696
 
7.8%
M 177240
 
7.2%
A 164935
 
6.7%
_ 163312
 
6.6%
R 127161
 
5.2%
- 122694
 
5.0%
I 115750
 
4.7%
Other values (20) 715288
29.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 2174511
88.4%
Connector Punctuation 163312
 
6.6%
Dash Punctuation 122694
 
5.0%
Decimal Number 63
 
< 0.1%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
O 239471
11.0%
C 227623
10.5%
S 216410
10.0%
E 190696
 
8.8%
M 177240
 
8.2%
A 164935
 
7.6%
R 127161
 
5.8%
I 115750
 
5.3%
T 111595
 
5.1%
N 102156
 
4.7%
Other values (16) 501474
23.1%
Decimal Number
ValueCountFrequency (%)
3 60
95.2%
2 3
 
4.8%
Connector Punctuation
ValueCountFrequency (%)
_ 163312
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 122694
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2174511
88.4%
Common 286069
 
11.6%

Most frequent character per script

Latin
ValueCountFrequency (%)
O 239471
11.0%
C 227623
10.5%
S 216410
10.0%
E 190696
 
8.8%
M 177240
 
8.2%
A 164935
 
7.6%
R 127161
 
5.8%
I 115750
 
5.3%
T 111595
 
5.1%
N 102156
 
4.7%
Other values (16) 501474
23.1%
Common
ValueCountFrequency (%)
_ 163312
57.1%
- 122694
42.9%
3 60
 
< 0.1%
2 3
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2460580
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
O 239471
 
9.7%
C 227623
 
9.3%
S 216410
 
8.8%
E 190696
 
7.8%
M 177240
 
7.2%
A 164935
 
6.7%
_ 163312
 
6.6%
R 127161
 
5.2%
- 122694
 
5.0%
I 115750
 
4.7%
Other values (20) 715288
29.1%

price
Real number (ℝ)

MISSING  SKEWED 

Distinct: 27612
Distinct (%): 22.7%
Missing: 1233
Missing (%): 1.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 22944636.07
Minimum: 1400
Maximum: 9999999999
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 958.7 KiB

Quantile statistics

Minimum: 1400
5-th percentile: 11500
Q1: 32900
median: 81500
Q3: 238860
95-th percentile: 41000000
Maximum: 9999999999
Range: 9999998599
Interquartile range (IQR): 205960

Descriptive statistics

Standard deviation: 213319501.1
Coefficient of variation (CV): 9.297140319
Kurtosis: 662.9554081
Mean: 22944636.07
Median Absolute Deviation (MAD): 60500
Skewness: 21.66230264
Sum: 2.786878442 × 10^12
Variance: 4.550520955 × 10^16
Monotonicity: Not monotonic
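
Several of the `price` figures above are simple functions of the others. The sketch below recomputes them from the reported values, assuming the usual definitions (IQR = Q3 − Q1, CV = standard deviation / mean, variance = standard deviation squared, sum = mean × non-missing count).

```python
import math

# Figures copied from the `price` statistics above.
n_observations, n_missing = 122694, 1233
mean, std = 22944636.07, 213319501.1
q1, q3 = 32900, 238860
minimum, maximum = 1400, 9999999999

n_present = n_observations - n_missing  # rows that have a price
iqr = q3 - q1
value_range = maximum - minimum
cv = std / mean
variance = std ** 2
total = mean * n_present

print(f"IQR={iqr}  range={value_range}")
print(f"CV={cv:.6f}  variance={variance:.6e}  sum={total:.6e}")
```

The mean (about 22.9 million) sits far above the median (81500) and even above Q3, which fits the Skewed alert: a few listings priced near 9999999999 dominate the moment-based statistics, so the quantiles are the safer summary here.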
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
90000                 985     0.8%
39900                 706     0.6%
19900                 676     0.6%
29900                 673     0.5%
60000                 614     0.5%
70000                 610     0.5%
49900                 609     0.5%
50000                 599     0.5%
59900                 543     0.4%
35000                 531     0.4%
Other values (27602)  114915  93.7%
(Missing)             1233    1.0%

Value  Count  Frequency (%)
1400   5      < 0.1%
1500   2      < 0.1%
2000   7      < 0.1%
2500   3      < 0.1%
2900   52     < 0.1%

Value       Count  Frequency (%)
9999999999  5      < 0.1%
9999000000  1      < 0.1%
9350000000  1      < 0.1%
9000000000  1      < 0.1%
8700000000  1      < 0.1%

original_price
Real number (ℝ)

MISSING  SKEWED 

Distinct: 7620
Distinct (%): 26.7%
Missing: 94174
Missing (%): 76.8%
Infinite: 0
Infinite (%): 0.0%
Mean: 472765.2113
Minimum: 3145
Maximum: 319900000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 958.7 KiB

Quantile statistics

Minimum: 3145
5-th percentile: 14900
Q1: 39512
median: 98900
Q3: 259900
95-th percentile: 1599900
Maximum: 319900000
Range: 319896855
Interquartile range (IQR): 220388

Descriptive statistics

Standard deviation: 4351787.258
Coefficient of variation (CV): 9.204965073
Kurtosis: 1722.660718
Mean: 472765.2113
Median Absolute Deviation (MAD): 71901
Skewness: 36.80616502
Sum: 1.348326383 × 10^10
Variance: 1.893805234 × 10^13
Monotonicity: Not monotonic
Histogram with fixed size bins (bins=50)
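
With fixed-size bins, a highly skewed variable puts nearly every value in the first bin. The sketch below is a generic equal-width binning routine run on made-up prices; the helper name and the toy data are illustrative and do not reproduce the profiler's plotting code.

```python
def fixed_width_histogram(values, bins):
    """Count values in `bins` equal-width intervals spanning [min, max]."""
    lo, hi = min(values), max(values)
    width = (hi - lo) / bins
    counts = [0] * bins
    for v in values:
        # The maximum lies on the right edge, so clamp it into the last bin.
        index = min(int((v - lo) / width), bins - 1) if width else 0
        counts[index] += 1
    return counts


# Hypothetical prices with one extreme outlier.
toy_prices = [10, 12, 15, 20, 25, 30, 40, 1000]
print(fixed_width_histogram(toy_prices, bins=5))
```

For variables like this one, plotting on a logarithmic scale or clipping at a high percentile usually gives a more informative picture.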
Value                Count  Frequency (%)
29900                250    0.2%
49900                242    0.2%
59900                232    0.2%
89900                226    0.2%
39900                224    0.2%
69900                222    0.2%
99900                222    0.2%
19900                210    0.2%
79900                184    0.1%
199900               180    0.1%
Other values (7610)  26328  21.5%
(Missing)            94174  76.8%

Value  Count  Frequency (%)
3145   2      < 0.1%
3200   1      < 0.1%
3225   1      < 0.1%
3239   1      < 0.1%
3500   1      < 0.1%

Value      Count  Frequency (%)
319900000  1      < 0.1%
178990000  1      < 0.1%
144990000  6      < 0.1%
144900000  1      < 0.1%
142900000  1      < 0.1%

available_quantity
Real number (ℝ)

SKEWED 

Distinct: 9
Distinct (%): < 0.1%
Missing: 0
Missing (%): 0.0%
Infinite: 0
Infinite (%): 0.0%
Mean: 98.92498411
Minimum: 1
Maximum: 50000
Zeros: 0
Zeros (%): 0.0%
Negative: 0
Negative (%): 0.0%
Memory size: 958.7 KiB

Quantile statistics

Minimum: 1
5-th percentile: 1
Q1: 1
median: 1
Q3: 1
95-th percentile: 250
Maximum: 50000
Range: 49999
Interquartile range (IQR): 0

Descriptive statistics

Standard deviation1354.370107
Coefficient of variation (CV)13.69088021
Kurtosis1259.556767
Mean98.92498411
Median Absolute Deviation (MAD)0
Skewness34.46965695
Sum12137502
Variance1834318.388
MonotonicityNot monotonic
Histogram with fixed size bins (bins=9)
Value  Count  Frequency (%)
1      95652  78.0%
50     11619  9.5%
500    5437   4.4%
250    3147   2.6%
150    2635   2.1%
100    2572   2.1%
200    966    0.8%
5000   582    0.5%
50000  84     0.1%

Value  Count  Frequency (%)
1      95652  78.0%
50     11619  9.5%
100    2572   2.1%
150    2635   2.1%
200    966    0.8%

Value  Count  Frequency (%)
50000  84     0.1%
5000   582    0.5%
500    5437   4.4%
250    3147   2.6%
200    966    0.8%
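
Because available_quantity takes only nine distinct values, the frequency table is complete and the summary statistics can be rebuilt from it. The following sketch (plain Python, names chosen for illustration) recomputes the number of observations, the Sum and the Mean from the value/count pairs.

```python
# Value -> count pairs copied from the available_quantity frequency table.
counts = {1: 95652, 50: 11619, 500: 5437, 250: 3147, 150: 2635,
          100: 2572, 200: 966, 5000: 582, 50000: 84}

n = sum(counts.values())                        # number of observations
total = sum(v * c for v, c in counts.items())   # reported Sum
mean = total / n                                # reported Mean

print(n, total, round(mean, 4))
```

With 78.0% of rows equal to 1, the median and every quartile are 1, which is why the IQR and MAD are both 0 even though the mean is close to 99.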

official_store_id
Real number (ℝ)

MISSING 

Distinct551
Distinct (%)3.6%
Missing107396
Missing (%)87.5%
Infinite0
Infinite (%)0.0%
Mean4671.14394
Minimum2
Maximum65680
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size958.7 KiB

Quantile statistics

Minimum2
5-th percentile153
Q1608
median1278
Q31960
95-th percentile50047
Maximum65680
Range65678
Interquartile range (IQR)1352

Descriptive statistics

Standard deviation13241.22539
Coefficient of variation (CV)2.834685799
Kurtosis10.63181263
Mean4671.14394
Median Absolute Deviation (MAD)680.5
Skewness3.529640562
Sum71459160
Variance175330050
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value               Count   Frequency (%)
1117                550     0.4%
1278                459     0.4%
1961                399     0.3%
2059                395     0.3%
2068                372     0.3%
1287                366     0.3%
663                 278     0.2%
344                 272     0.2%
467                 239     0.2%
535                 229     0.2%
Other values (541)  11739   9.6%
(Missing)           107396  87.5%

Value  Count  Frequency (%)
2      42     < 0.1%
4      28     < 0.1%
11     95     0.1%
22     114    0.1%
23     10     < 0.1%

Value  Count  Frequency (%)
65680  10     < 0.1%
65522  6      < 0.1%
64348  2      < 0.1%
63262  4      < 0.1%
62882  46     < 0.1%
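
The percentages in the summary blocks do not all share a denominator. Missing (%) is relative to all 122694 observations, whereas Distinct (%) appears to be relative to the non-missing rows only. That reading is inferred from the numbers, not stated by the report; the sketch below (helper name is illustrative) shows that it reproduces the displayed figures for official_store_id and original_price.

```python
N_ROWS = 122_694  # number of observations from the report overview

def pct(part, whole):
    """Percentage rounded to one decimal, as displayed in the report."""
    return round(100 * part / whole, 1)

# official_store_id: 551 distinct values, 107396 missing
store_missing, store_distinct = 107_396, 551
# original_price: 7620 distinct values, 94174 missing
price_missing, price_distinct = 94_174, 7_620

print(pct(store_missing, N_ROWS))                    # Missing (%)
print(pct(store_distinct, N_ROWS - store_missing))   # Distinct (%)
print(pct(price_distinct, N_ROWS - price_missing))   # Distinct (%)
```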

official_store_name
Text

MISSING 

Distinct548
Distinct (%)3.6%
Missing107442
Missing (%)87.6%
Memory size958.7 KiB

Length

Max length30
Median length26
Mean length10.04301075
Min length2

Characters and Unicode

Total characters153176
Distinct characters60
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique68 ?
Unique (%)0.4%

Sample

1st rowEnergizer
2nd rowEnergizer
3rd rowChallenger
4th rowJBL
5th rowSony
Value               Count  Frequency (%)
libreria            842    3.8%
la                  759    3.4%
ferreteria          586    2.6%
croydon             550    2.5%
de                  489    2.2%
u                   459    2.0%
gm                  399    1.8%
comunicaciones      399    1.8%
ipetplace           395    1.8%
ebook               372    1.7%
Other values (661)  17175  76.6%

Most occurring characters

ValueCountFrequency (%)
e 13964
 
9.1%
a 13094
 
8.5%
o 12732
 
8.3%
r 9990
 
6.5%
i 8924
 
5.8%
7305
 
4.8%
l 7064
 
4.6%
t 6435
 
4.2%
s 6265
 
4.1%
n 6229
 
4.1%
Other values (50) 61174
39.9%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 112653
73.5%
Uppercase Letter 32945
 
21.5%
Space Separator 7305
 
4.8%
Decimal Number 273
 
0.2%
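
Category tables like the one above can be reproduced with Python's standard unicodedata module. The sketch below is illustrative only (the helper and the name mapping are not part of the report); it counts Unicode general categories for the five sample rows of official_store_name.

```python
import unicodedata
from collections import Counter

# Long names for the two-letter general-category codes seen in this section.
CATEGORY_NAMES = {
    "Ll": "Lowercase Letter",
    "Lu": "Uppercase Letter",
    "Zs": "Space Separator",
    "Nd": "Decimal Number",
}

def category_counts(values):
    """Count characters per Unicode general category across all strings."""
    counts = Counter()
    for value in values:
        for ch in value:
            code = unicodedata.category(ch)
            counts[CATEGORY_NAMES.get(code, code)] += 1
    return counts

# The five sample rows shown for official_store_name.
sample = ["Energizer", "Energizer", "Challenger", "JBL", "Sony"]
print(category_counts(sample))
```

Run over the full column instead of the sample, the same loop would be expected to yield the four categories listed in the table.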

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 13964
12.4%
a 13094
11.6%
o 12732
11.3%
r 9990
8.9%
i 8924
 
7.9%
l 7064
 
6.3%
t 6435
 
5.7%
s 6265
 
5.6%
n 6229
 
5.5%
c 5060
 
4.5%
Other values (16) 22896
20.3%
Uppercase Letter
ValueCountFrequency (%)
M 2742
 
8.3%
L 2489
 
7.6%
C 2472
 
7.5%
E 2211
 
6.7%
A 2188
 
6.6%
I 1943
 
5.9%
B 1853
 
5.6%
T 1823
 
5.5%
O 1822
 
5.5%
R 1626
 
4.9%
Other values (16) 11776
35.7%
Decimal Number
ValueCountFrequency (%)
2 155
56.8%
1 75
27.5%
6 16
 
5.9%
3 14
 
5.1%
0 9
 
3.3%
4 2
 
0.7%
5 2
 
0.7%
Space Separator
ValueCountFrequency (%)
7305
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 145598
95.1%
Common 7578
 
4.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 13964
 
9.6%
a 13094
 
9.0%
o 12732
 
8.7%
r 9990
 
6.9%
i 8924
 
6.1%
l 7064
 
4.9%
t 6435
 
4.4%
s 6265
 
4.3%
n 6229
 
4.3%
c 5060
 
3.5%
Other values (42) 55841
38.4%
Common
ValueCountFrequency (%)
7305
96.4%
2 155
 
2.0%
1 75
 
1.0%
6 16
 
0.2%
3 14
 
0.2%
0 9
 
0.1%
4 2
 
< 0.1%
5 2
 
< 0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 153176
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 13964
 
9.1%
a 13094
 
8.5%
o 12732
 
8.3%
r 9990
 
6.5%
i 8924
 
5.8%
7305
 
4.8%
l 7064
 
4.6%
t 6435
 
4.2%
s 6265
 
4.1%
n 6229
 
4.1%
Other values (50) 61174
39.9%

accepts_mercadopago
Boolean

IMBALANCE 

Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size119.9 KiB
Value  Count   Frequency (%)
True   112183  91.4%
False  10511   8.6%
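
The overview flags accepts_mercadopago as highly imbalanced with a score of 57.8%, which is not the share of either class. Assuming the score is one minus the normalized Shannon entropy of the class frequencies (an assumption that reproduces the figure; the report itself does not state the formula), it can be recomputed from the two counts:

```python
import math

def imbalance_score(class_counts):
    """1 - (Shannon entropy / maximum entropy); 0 = balanced, 1 = one class only."""
    if len(class_counts) < 2:
        return 0.0
    total = sum(class_counts)
    probs = [c / total for c in class_counts if c > 0]
    entropy = -sum(p * math.log2(p) for p in probs)
    return 1 - entropy / math.log2(len(class_counts))

# True / False counts for accepts_mercadopago from the table above.
score = imbalance_score([112183, 10511])
print(f"{score:.1%}")
```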
Distinct20679
Distinct (%)16.9%
Missing0
Missing (%)0.0%
Memory size958.7 KiB

Length

Max length24
Median length24
Mean length24
Min length24

Characters and Unicode

Total characters2944656
Distinct characters15
Distinct categories4 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique16454 ?
Unique (%)13.4%

Sample

1st row2043-03-12T04:00:00.000Z
2nd row2044-02-03T14:32:39.000Z
3rd row2042-10-29T04:00:00.000Z
4th row2044-02-01T16:24:12.000Z
5th row2043-07-20T04:00:00.000Z
Value                     Count   Frequency (%)
2043-08-17t04:00:00.000z  421     0.3%
2043-10-19t04:00:00.000z  363     0.3%
2043-07-19t04:00:00.000z  358     0.3%
2043-08-19t04:00:00.000z  335     0.3%
2043-08-03t04:00:00.000z  329     0.3%
2043-09-13t04:00:00.000z  323     0.3%
2043-08-23t04:00:00.000z  318     0.3%
2043-10-26t04:00:00.000z  317     0.3%
2043-07-22t04:00:00.000z  316     0.3%
2043-07-28t04:00:00.000z  309     0.3%
Other values (20669)      119305  97.2%
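
This variable holds fixed-length ISO 8601 timestamps but is profiled as text, so no date statistics are available for it. A minimal sketch of parsing such strings with the standard library follows; the helper name is illustrative and the inputs are three of the sample rows shown above.

```python
from datetime import datetime, timezone

def parse_utc(ts: str) -> datetime:
    """Parse strings like '2043-03-12T04:00:00.000Z' into aware UTC datetimes."""
    return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%fZ").replace(tzinfo=timezone.utc)

samples = [
    "2043-03-12T04:00:00.000Z",
    "2044-02-03T14:32:39.000Z",
    "2042-10-29T04:00:00.000Z",
]
parsed = [parse_utc(s) for s in samples]
print(min(parsed).isoformat(), max(parsed).isoformat())
```

Once parsed, the column could be profiled as a date/time variable, which would give a minimum, maximum and range instead of character counts.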

Most occurring characters

ValueCountFrequency (%)
0 1161075
39.4%
4 265340
 
9.0%
- 245388
 
8.3%
: 245388
 
8.3%
2 243861
 
8.3%
1 149648
 
5.1%
T 122694
 
4.2%
. 122694
 
4.2%
Z 122694
 
4.2%
3 108077
 
3.7%
Other values (5) 157797
 
5.4%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 2085798
70.8%
Other Punctuation 368082
 
12.5%
Dash Punctuation 245388
 
8.3%
Uppercase Letter 245388
 
8.3%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
0 1161075
55.7%
4 265340
 
12.7%
2 243861
 
11.7%
1 149648
 
7.2%
3 108077
 
5.2%
5 35818
 
1.7%
8 31608
 
1.5%
7 31521
 
1.5%
9 31428
 
1.5%
6 27422
 
1.3%
Other Punctuation
ValueCountFrequency (%)
: 245388
66.7%
. 122694
33.3%
Uppercase Letter
ValueCountFrequency (%)
T 122694
50.0%
Z 122694
50.0%
Dash Punctuation
ValueCountFrequency (%)
- 245388
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 2699268
91.7%
Latin 245388
 
8.3%

Most frequent character per script

Common
ValueCountFrequency (%)
0 1161075
43.0%
4 265340
 
9.8%
- 245388
 
9.1%
: 245388
 
9.1%
2 243861
 
9.0%
1 149648
 
5.5%
. 122694
 
4.5%
3 108077
 
4.0%
5 35818
 
1.3%
8 31608
 
1.2%
Other values (3) 90371
 
3.3%
Latin
ValueCountFrequency (%)
T 122694
50.0%
Z 122694
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2944656
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
0 1161075
39.4%
4 265340
 
9.0%
- 245388
 
8.3%
: 245388
 
8.3%
2 243861
 
8.3%
1 149648
 
5.1%
T 122694
 
4.2%
. 122694
 
4.2%
Z 122694
 
4.2%
3 108077
 
3.7%
Other values (5) 157797
 
5.4%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size119.9 KiB
Value  Count  Frequency (%)
False  85144  69.4%
True   37550  30.6%

promotions
Text

CONSTANT 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size958.7 KiB

Length

Max length2
Median length2
Mean length2
Min length2

Characters and Unicode

Total characters245388
Distinct characters2
Distinct categories2 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st row[]
2nd row[]
3rd row[]
4th row[]
5th row[]
ValueCountFrequency (%)
122694
100.0%

Most occurring characters

ValueCountFrequency (%)
[ 122694
50.0%
] 122694
50.0%

Most occurring categories

ValueCountFrequency (%)
Open Punctuation 122694
50.0%
Close Punctuation 122694
50.0%

Most frequent character per category

Open Punctuation
ValueCountFrequency (%)
[ 122694
100.0%
Close Punctuation
ValueCountFrequency (%)
] 122694
100.0%

Most occurring scripts

ValueCountFrequency (%)
Common 245388
100.0%

Most frequent character per script

Common
ValueCountFrequency (%)
[ 122694
50.0%
] 122694
50.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 245388
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
[ 122694
50.0%
] 122694
50.0%
Distinct32
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size958.7 KiB

Length

Max length9
Median length7
Mean length7.328418668
Min length7

Characters and Unicode

Total characters899153
Distinct characters13
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowMCO1000
2nd rowMCO1000
3rd rowMCO1000
4th rowMCO1000
5th rowMCO1000
Value              Count  Frequency (%)
mco1368            4000   3.3%
mco1367            4000   3.3%
mco1743            4000   3.3%
mco1430            4000   3.3%
mco1168            4000   3.3%
mco118204          4000   3.3%
mco1403            4000   3.3%
mco175794          4000   3.3%
mco1276            4000   3.3%
mco180800          3999   3.3%
Other values (22)  82695  67.4%

Most occurring characters

ValueCountFrequency (%)
1 142370
15.8%
M 122694
13.6%
C 122694
13.6%
O 122694
13.6%
4 67044
7.5%
0 58769
6.5%
7 51944
 
5.8%
3 48598
 
5.4%
8 35980
 
4.0%
9 35963
 
4.0%
Other values (3) 90403
10.1%

Most occurring categories

ValueCountFrequency (%)
Decimal Number 531071
59.1%
Uppercase Letter 368082
40.9%

Most frequent character per category

Decimal Number
ValueCountFrequency (%)
1 142370
26.8%
4 67044
12.6%
0 58769
11.1%
7 51944
 
9.8%
3 48598
 
9.2%
8 35980
 
6.8%
9 35963
 
6.8%
2 31979
 
6.0%
5 30446
 
5.7%
6 27978
 
5.3%
Uppercase Letter
ValueCountFrequency (%)
M 122694
33.3%
C 122694
33.3%
O 122694
33.3%

Most occurring scripts

ValueCountFrequency (%)
Common 531071
59.1%
Latin 368082
40.9%

Most frequent character per script

Common
ValueCountFrequency (%)
1 142370
26.8%
4 67044
12.6%
0 58769
11.1%
7 51944
 
9.8%
3 48598
 
9.2%
8 35980
 
6.8%
9 35963
 
6.8%
2 31979
 
6.0%
5 30446
 
5.7%
6 27978
 
5.3%
Latin
ValueCountFrequency (%)
M 122694
33.3%
C 122694
33.3%
O 122694
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 899153
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
1 142370
15.8%
M 122694
13.6%
C 122694
13.6%
O 122694
13.6%
4 67044
7.5%
0 58769
6.5%
7 51944
 
5.8%
3 48598
 
5.4%
8 35980
 
4.0%
9 35963
 
4.0%
Other values (3) 90403
10.1%

shipping_store_pick_up
Boolean

CONSTANT 

Distinct1
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size119.9 KiB
Value  Count   Frequency (%)
False  122694  100.0%
Distinct2
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size119.9 KiB
Value  Count  Frequency (%)
False  70954  57.8%
True   51740  42.2%

shipping_logistic_type
Text

MISSING 

Distinct7
Distinct (%)< 0.1%
Missing10083
Missing (%)8.2%
Memory size958.7 KiB

Length

Max length13
Median length11
Mean length11.45898713
Min length6

Characters and Unicode

Total characters1290408
Distinct characters19
Distinct categories2 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowfulfillment
2nd rowfulfillment
3rd rowcross_docking
4th rowcross_docking
5th rowfulfillment
Value          Count  Frequency (%)
xd_drop_off    45607  40.5%
cross_docking  38287  34.0%
fulfillment    15212  13.5%
drop_off       9411   8.4%
not_specified  3389   3.0%
custom         612    0.5%
default        93     0.1%

Most occurring characters

ValueCountFrequency (%)
o 190611
14.8%
f 143942
11.2%
d 142394
11.0%
_ 142301
11.0%
r 93305
 
7.2%
c 80575
 
6.2%
s 80575
 
6.2%
i 60277
 
4.7%
p 58407
 
4.5%
n 56888
 
4.4%
Other values (9) 241133
18.7%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 1148107
89.0%
Connector Punctuation 142301
 
11.0%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 190611
16.6%
f 143942
12.5%
d 142394
12.4%
r 93305
8.1%
c 80575
7.0%
s 80575
7.0%
i 60277
 
5.3%
p 58407
 
5.1%
n 56888
 
5.0%
l 45729
 
4.0%
Other values (8) 195404
17.0%
Connector Punctuation
ValueCountFrequency (%)
_ 142301
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1148107
89.0%
Common 142301
 
11.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 190611
16.6%
f 143942
12.5%
d 142394
12.4%
r 93305
8.1%
c 80575
7.0%
s 80575
7.0%
i 60277
 
5.3%
p 58407
 
5.1%
n 56888
 
5.0%
l 45729
 
4.0%
Other values (8) 195404
17.0%
Common
ValueCountFrequency (%)
_ 142301
100.0%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1290408
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
o 190611
14.8%
f 143942
11.2%
d 142394
11.0%
_ 142301
11.0%
r 93305
 
7.2%
c 80575
 
6.2%
s 80575
 
6.2%
i 60277
 
4.7%
p 58407
 
4.5%
n 56888
 
4.4%
Other values (9) 241133
18.7%
Distinct4
Distinct (%)< 0.1%
Missing0
Missing (%)0.0%
Memory size958.7 KiB

Length

Max length13
Median length3
Mean length4.11298026
Min length3

Characters and Unicode

Total characters504638
Distinct characters15
Distinct categories3 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowme2
2nd rowme2
3rd rowme2
4th rowme2
5th rowme2
Value          Count   Frequency (%)
me2            108517  88.4%
not_specified  13472   11.0%
custom         612     0.5%
me1            93      0.1%

Most occurring characters

ValueCountFrequency (%)
e 135554
26.9%
m 109222
21.6%
2 108517
21.5%
i 26944
 
5.3%
o 14084
 
2.8%
t 14084
 
2.8%
s 14084
 
2.8%
c 14084
 
2.8%
n 13472
 
2.7%
_ 13472
 
2.7%
Other values (5) 41121
 
8.1%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 382556
75.8%
Decimal Number 108610
 
21.5%
Connector Punctuation 13472
 
2.7%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 135554
35.4%
m 109222
28.6%
i 26944
 
7.0%
o 14084
 
3.7%
t 14084
 
3.7%
s 14084
 
3.7%
c 14084
 
3.7%
n 13472
 
3.5%
p 13472
 
3.5%
f 13472
 
3.5%
Other values (2) 14084
 
3.7%
Decimal Number
ValueCountFrequency (%)
2 108517
99.9%
1 93
 
0.1%
Connector Punctuation
ValueCountFrequency (%)
_ 13472
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 382556
75.8%
Common 122082
 
24.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 135554
35.4%
m 109222
28.6%
i 26944
 
7.0%
o 14084
 
3.7%
t 14084
 
3.7%
s 14084
 
3.7%
c 14084
 
3.7%
n 13472
 
3.5%
p 13472
 
3.5%
f 13472
 
3.5%
Other values (2) 14084
 
3.7%
Common
ValueCountFrequency (%)
2 108517
88.9%
_ 13472
 
11.0%
1 93
 
0.1%

Most occurring blocks

ValueCountFrequency (%)
ASCII 504638
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 135554
26.9%
m 109222
21.6%
2 108517
21.5%
i 26944
 
5.3%
o 14084
 
2.8%
t 14084
 
2.8%
s 14084
 
2.8%
c 14084
 
2.8%
n 13472
 
2.7%
_ 13472
 
2.7%
Other values (5) 41121
 
8.1%
Distinct127
Distinct (%)0.1%
Missing0
Missing (%)0.0%
Memory size958.7 KiB

Length

Max length135
Median length132
Mean length24.11475704
Min length2

Characters and Unicode

Total characters2958736
Distinct characters37
Distinct categories9 ?
Distinct scripts2 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique24 ?
Unique (%)< 0.1%

Sample

1st row['fulfillment', 'self_service_out']
2nd row['fulfillment', 'self_service_in']
3rd row['self_service_in']
4th row['mandatory_free_shipping', 'optional_me2_chosen']
5th row['fulfillment', 'self_service_in', 'mandatory_free_shipping']
ValueCountFrequency (%)
self_service_in 54448
33.7%
mandatory_free_shipping 46987
29.1%
29448
18.2%
fulfillment 15212
 
9.4%
self_service_out 7682
 
4.8%
mco-chg-threshold-jan-23 3497
 
2.2%
fs_threshold_mco_change_apr2021 1182
 
0.7%
fs_threshold_mco_change_jul2021 1065
 
0.7%
fbm_in_process 815
 
0.5%
is_flammable 761
 
0.5%
Other values (8) 447
 
0.3%
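
The sample rows show that each value in this variable is a Python-style list literal stored as text, which is why brackets, quotes and commas dominate the character statistics. A minimal sketch of turning such strings back into lists and counting individual tags is given below, using the five sample rows; the helper name is illustrative.

```python
import ast
from collections import Counter

# The five sample rows shown above, stored as text rather than as lists.
rows = [
    "['fulfillment', 'self_service_out']",
    "['fulfillment', 'self_service_in']",
    "['self_service_in']",
    "['mandatory_free_shipping', 'optional_me2_chosen']",
    "['fulfillment', 'self_service_in', 'mandatory_free_shipping']",
]

def parse_tags(text):
    """Turn a Python-style list literal into a real list; '[]' gives []."""
    value = ast.literal_eval(text)
    return list(value) if isinstance(value, (list, tuple)) else []

tag_counts = Counter(tag for row in rows for tag in parse_tags(row))
print(tag_counts.most_common(3))
```

ast.literal_eval only evaluates literals, so it is a safer choice than eval for data of unknown origin. The same approach would apply to the constant promotions column, whose only value is the empty list "[]".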

Most occurring characters

ValueCountFrequency (%)
e 306126
 
10.3%
' 264192
 
8.9%
_ 230696
 
7.8%
i 228109
 
7.7%
s 182387
 
6.2%
n 170924
 
5.8%
r 163999
 
5.5%
f 143447
 
4.8%
[ 122694
 
4.1%
] 122694
 
4.1%
Other values (27) 1023468
34.6%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 2096443
70.9%
Other Punctuation 303042
 
10.2%
Connector Punctuation 230696
 
7.8%
Open Punctuation 122694
 
4.1%
Close Punctuation 122694
 
4.1%
Space Separator 38850
 
1.3%
Decimal Number 16341
 
0.6%
Dash Punctuation 13988
 
0.5%
Uppercase Letter 13988
 
0.5%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 306126
14.6%
i 228109
10.9%
s 182387
 
8.7%
n 170924
 
8.2%
r 163999
 
7.8%
f 143447
 
6.8%
l 116702
 
5.6%
a 103113
 
4.9%
p 96228
 
4.6%
t 75994
 
3.6%
Other values (12) 509414
24.3%
Decimal Number
ValueCountFrequency (%)
2 8320
50.9%
3 3497
21.4%
1 2277
 
13.9%
0 2247
 
13.8%
Uppercase Letter
ValueCountFrequency (%)
M 3497
25.0%
C 3497
25.0%
O 3497
25.0%
J 3497
25.0%
Other Punctuation
ValueCountFrequency (%)
' 264192
87.2%
, 38850
 
12.8%
Connector Punctuation
ValueCountFrequency (%)
_ 230696
100.0%
Open Punctuation
ValueCountFrequency (%)
[ 122694
100.0%
Close Punctuation
ValueCountFrequency (%)
] 122694
100.0%
Space Separator
ValueCountFrequency (%)
38850
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 13988
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 2110431
71.3%
Common 848305
28.7%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 306126
14.5%
i 228109
10.8%
s 182387
 
8.6%
n 170924
 
8.1%
r 163999
 
7.8%
f 143447
 
6.8%
l 116702
 
5.5%
a 103113
 
4.9%
p 96228
 
4.6%
t 75994
 
3.6%
Other values (16) 523402
24.8%
Common
ValueCountFrequency (%)
' 264192
31.1%
_ 230696
27.2%
[ 122694
14.5%
] 122694
14.5%
, 38850
 
4.6%
38850
 
4.6%
- 13988
 
1.6%
2 8320
 
1.0%
3 3497
 
0.4%
1 2277
 
0.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 2958736
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 306126
 
10.3%
' 264192
 
8.9%
_ 230696
 
7.8%
i 228109
 
7.7%
s 182387
 
6.2%
n 170924
 
5.8%
r 163999
 
5.5%
f 143447
 
4.8%
[ 122694
 
4.1%
] 122694
 
4.1%
Other values (27) 1023468
34.6%

shipping_benefits
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing122694
Missing (%)100.0%
Memory size958.7 KiB

shipping_promise
Unsupported

MISSING  REJECTED  UNSUPPORTED 

Missing122694
Missing (%)100.0%
Memory size958.7 KiB

seller_id
Real number (ℝ)

Distinct15024
Distinct (%)12.2%
Missing0
Missing (%)0.0%
Infinite0
Infinite (%)0.0%
Mean461490403
Minimum17434
Maximum1699297072
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size958.7 KiB

Quantile statistics

Minimum17434
5-th percentile39773006.3
Q1160820235
median311862503
Q3637614324
95-th percentile1357110642
Maximum1699297072
Range1699279638
Interquartile range (IQR)476794089

Descriptive statistics

Standard deviation409865390.3
Coefficient of variation (CV)0.8881341575
Kurtosis0.644099614
Mean461490403
Median Absolute Deviation (MAD)207755224
Skewness1.219153028
Sum5.662210351 × 10^13
Variance1.679896381 × 10^17
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                 Count   Frequency (%)
181939816             1094    0.9%
791273252             885     0.7%
1194088518            848     0.7%
394380898             662     0.5%
91086457              605     0.5%
313600420             551     0.4%
573609394             490     0.4%
178562922             473     0.4%
279263455             459     0.4%
438128163             426     0.3%
Other values (15014)  116201  94.7%

Value  Count  Frequency (%)
17434  24     < 0.1%
51545  1      < 0.1%
79421  1      < 0.1%
83777  1      < 0.1%
86705  25     < 0.1%

Value       Count  Frequency (%)
1699297072  1      < 0.1%
1699214406  1      < 0.1%
1699209680  1      < 0.1%
1699183850  1      < 0.1%
1699177230  1      < 0.1%
Distinct15024
Distinct (%)12.2%
Missing0
Missing (%)0.0%
Memory size958.7 KiB

Length

Max length30
Median length24
Mean length15.16379774
Min length3

Characters and Unicode

Total characters1860507
Distinct characters50
Distinct categories6 ?
Distinct scripts2 ?
Distinct blocks2 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique8146 ?
Unique (%)6.6%

Sample

1st rowOPERADOR TO
2nd rowOPERADOR TO
3rd rowJUAN20220131132107
4th rowCHALLENGER S.A.S.
5th rowIXCOMERCIO COLOMBIA
Value                        Count   Frequency (%)
sas                          3316    2.0%
colombia                     2147    1.3%
s.a.s                        1618    1.0%
store                        1475    0.9%
tienda                       1386    0.8%
hitmusical                   1094    0.7%
discotienda                  1094    0.7%
online                       917     0.6%
liberimportacinsasliberimp   885     0.5%
gruporespaldoinmobiliariosa  848     0.5%
Other values (15950)         148732  91.0%

Most occurring characters

ValueCountFrequency (%)
A 179896
 
9.7%
O 169360
 
9.1%
E 152388
 
8.2%
I 140463
 
7.5%
S 122277
 
6.6%
R 113705
 
6.1%
C 100192
 
5.4%
L 90427
 
4.9%
T 88788
 
4.8%
N 87958
 
4.7%
Other values (40) 615053
33.1%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 1653420
88.9%
Decimal Number 131199
 
7.1%
Space Separator 40819
 
2.2%
Other Punctuation 17634
 
0.9%
Connector Punctuation 11982
 
0.6%
Dash Punctuation 5453
 
0.3%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
A 179896
10.9%
O 169360
10.2%
E 152388
 
9.2%
I 140463
 
8.5%
S 122277
 
7.4%
R 113705
 
6.9%
C 100192
 
6.1%
L 90427
 
5.5%
T 88788
 
5.4%
N 87958
 
5.3%
Other values (23) 407966
24.7%
Decimal Number
ValueCountFrequency (%)
2 27312
20.8%
0 21523
16.4%
1 19542
14.9%
3 12211
9.3%
4 9614
 
7.3%
5 9364
 
7.1%
9 8367
 
6.4%
7 8012
 
6.1%
6 7877
 
6.0%
8 7377
 
5.6%
Other Punctuation
ValueCountFrequency (%)
. 17626
> 99.9%
* 6
 
< 0.1%
: 1
 
< 0.1%
@ 1
 
< 0.1%
Space Separator
ValueCountFrequency (%)
40819
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 11982
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 5453
100.0%

Most occurring scripts

ValueCountFrequency (%)
Latin 1653420
88.9%
Common 207087
 
11.1%

Most frequent character per script

Latin
ValueCountFrequency (%)
A 179896
10.9%
O 169360
10.2%
E 152388
 
9.2%
I 140463
 
8.5%
S 122277
 
7.4%
R 113705
 
6.9%
C 100192
 
6.1%
L 90427
 
5.5%
T 88788
 
5.4%
N 87958
 
5.3%
Other values (23) 407966
24.7%
Common
ValueCountFrequency (%)
40819
19.7%
2 27312
13.2%
0 21523
10.4%
1 19542
9.4%
. 17626
8.5%
3 12211
 
5.9%
_ 11982
 
5.8%
4 9614
 
4.6%
5 9364
 
4.5%
9 8367
 
4.0%
Other values (7) 28727
13.9%

Most occurring blocks

ValueCountFrequency (%)
ASCII 1859006
99.9%
None 1501
 
0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
A 179896
 
9.7%
O 169360
 
9.1%
E 152388
 
8.2%
I 140463
 
7.6%
S 122277
 
6.6%
R 113705
 
6.1%
C 100192
 
5.4%
L 90427
 
4.9%
T 88788
 
4.8%
N 87958
 
4.7%
Other values (33) 613552
33.0%
None
ValueCountFrequency (%)
Í 461
30.7%
Ñ 282
18.8%
Ó 281
18.7%
Á 232
15.5%
É 189
12.6%
Ú 52
 
3.5%
Ü 4
 
0.3%
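
The block labelled "None" contains only accented capital letters. Their code points all lie between U+0080 and U+00FF, the range Unicode assigns to the Latin-1 Supplement block, so "None" most likely means the profiler did not resolve a block name for them rather than that they belong to no block. The sketch below checks the code points with the standard unicodedata module; it is an illustration, not part of the report.

```python
import unicodedata

# Characters listed under the "None" block for this variable,
# normalized to NFC so each letter is a single code point.
chars = unicodedata.normalize("NFC", "ÍÑÓÁÉÚÜ")

for ch in chars:
    print(f"U+{ord(ch):04X}  {unicodedata.name(ch)}")

# Latin-1 Supplement spans U+0080..U+00FF in the Unicode block list.
in_latin1_supplement = all(0x0080 <= ord(ch) <= 0x00FF for ch in chars)
print(in_latin1_supplement)
```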

installments_quantity
Real number (ℝ)

MISSING 

Distinct2
Distinct (%)< 0.1%
Missing26197
Missing (%)21.4%
Infinite0
Infinite (%)0.0%
Mean28.22251469
Minimum12
Maximum36
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size958.7 KiB

Quantile statistics

Minimum12
5-th percentile12
Q112
median36
Q336
95-th percentile36
Maximum36
Range24
Interquartile range (IQR)24

Descriptive statistics

Standard deviation11.23261667
Coefficient of variation (CV)0.3980019779
Kurtosis-1.434756459
Mean28.22251469
Median Absolute Deviation (MAD)0
Skewness-0.751846579
Sum2723388
Variance126.1716772
MonotonicityNot monotonic
Histogram with fixed size bins (bins=2)
Value      Count  Frequency (%)
36         65226  53.2%
12         31271  25.5%
(Missing)  26197  21.4%

Value  Count  Frequency (%)
12     31271  25.5%
36     65226  53.2%

Value  Count  Frequency (%)
36     65226  53.2%
12     31271  25.5%

installments_amount
Real number (ℝ)

MISSING  SKEWED 

Distinct25952
Distinct (%)26.9%
Missing26197
Missing (%)21.4%
Infinite0
Infinite (%)0.0%
Mean13664.68919
Minimum546.81
Maximum4086158.33
Zeros0
Zeros (%)0.0%
Negative0
Negative (%)0.0%
Memory size958.7 KiB

Quantile statistics

Minimum546.81
5-th percentile732.86
Q11815.42
median3747.22
Q38886.11
95-th percentile48283.336
Maximum4086158.33
Range4085611.52
Interquartile range (IQR)7070.69

Descriptive statistics

Standard deviation61229.52166
Coefficient of variation (CV)4.480857254
Kurtosis1099.128579
Mean13664.68919
Median Absolute Deviation (MAD)2466.66
Skewness26.51384463
Sum1318601512
Variance3749054323
MonotonicityNot monotonic
Histogram with fixed size bins (bins=50)
Value                 Count  Frequency (%)
2500                  771    0.6%
4166.67               474    0.4%
1666.67               462    0.4%
5000                  442    0.4%
1108.33               430    0.4%
7500                  408    0.3%
5833.33               382    0.3%
1386.11               375    0.3%
2083.33               374    0.3%
552.78                367    0.3%
Other values (25942)  92012  75.0%
(Missing)             26197  21.4%

Value   Count  Frequency (%)
546.81  1      < 0.1%
546.94  2      < 0.1%
547.19  2      < 0.1%
547.22  20     < 0.1%
547.42  1      < 0.1%

Value       Count  Frequency (%)
4086158.33  1      < 0.1%
3583333.33  1      < 0.1%
3453241.67  1      < 0.1%
3333333.33  1      < 0.1%
2962575     1      < 0.1%
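
Several of the most frequent installment amounts look like round prices divided by the number of installments. This is an observation from the numbers, not a rule stated in the report: 552.78, 1108.33 and 1386.11 are what 19900, 39900 and 49900 (all among the most common original_price values) give when split into 36 equal payments with no interest. A minimal sketch of that arithmetic, with a hypothetical helper name:

```python
def installment_amount(price, quantity):
    """Equal payment for a zero-interest plan, rounded to two decimals."""
    return round(price / quantity, 2)

# Round prices that also appear among the common original_price values.
for price in (19_900, 39_900, 49_900):
    print(price, installment_amount(price, 36))
```

If the relationship holds across the dataset, installments_amount carries little information beyond price and installments_quantity, which would be worth checking before using all three as features.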

installments_rate
Real number (ℝ)

CONSTANT  MISSING  ZEROS 

Distinct1
Distinct (%)< 0.1%
Missing26197
Missing (%)21.4%
Infinite0
Infinite (%)0.0%
Mean0
Minimum0
Maximum0
Zeros96497
Zeros (%)78.6%
Negative0
Negative (%)0.0%
Memory size958.7 KiB

Quantile statistics

Minimum0
5-th percentile0
Q10
median0
Q30
95-th percentile0
Maximum0
Range0
Interquartile range (IQR)0

Descriptive statistics

Standard deviation0
Coefficient of variation (CV)nan
Kurtosis0
Mean0
Median Absolute Deviation (MAD)0
Skewness0
Sum0
Variance0
MonotonicityIncreasing
Histogram with fixed size bins (bins=1)
Value      Count  Frequency (%)
0          96497  78.6%
(Missing)  26197  21.4%

Value  Count  Frequency (%)
0      96497  78.6%

Value  Count  Frequency (%)
0      96497  78.6%

installments_currency_id
Text

CONSTANT  MISSING 

Distinct1
Distinct (%)< 0.1%
Missing26197
Missing (%)21.4%
Memory size958.7 KiB

Length

Max length3
Median length3
Mean length3
Min length3

Characters and Unicode

Total characters289491
Distinct characters3
Distinct categories1 ?
Distinct scripts1 ?
Distinct blocks1 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique0 ?
Unique (%)0.0%

Sample

1st rowCOP
2nd rowCOP
3rd rowCOP
4th rowCOP
5th rowCOP
Value  Count  Frequency (%)
cop    96497  100.0%

Most occurring characters

ValueCountFrequency (%)
C 96497
33.3%
O 96497
33.3%
P 96497
33.3%

Most occurring categories

ValueCountFrequency (%)
Uppercase Letter 289491
100.0%

Most frequent character per category

Uppercase Letter
ValueCountFrequency (%)
C 96497
33.3%
O 96497
33.3%
P 96497
33.3%

Most occurring scripts

ValueCountFrequency (%)
Latin 289491
100.0%

Most frequent character per script

Latin
ValueCountFrequency (%)
C 96497
33.3%
O 96497
33.3%
P 96497
33.3%

Most occurring blocks

ValueCountFrequency (%)
ASCII 289491
100.0%

Most frequent character per block

ASCII
ValueCountFrequency (%)
C 96497
33.3%
O 96497
33.3%
P 96497
33.3%

brand_value_name
Text

MISSING 

Distinct19762
Distinct (%)19.2%
Missing19767
Missing (%)16.1%
Memory size958.7 KiB

Length

Max length214
Median length200
Mean length8.579264916
Min length1

Characters and Unicode

Total characters883038
Distinct characters125
Distinct categories16 ?
Distinct scripts2 ?
Distinct blocks4 ?
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique11330 ?
Unique (%)11.0%

Sample

1st rowEnergizer
2nd rowEnergizer
3rd rowShenzhen Yihaotong
4th rowChallenger
5th rowJBL
ValueCountFrequency (%)
genérica 10636
 
7.7%
truper 1057
 
0.8%
de 1044
 
0.8%
casio 757
 
0.6%
653
 
0.5%
sony 631
 
0.5%
samsung 623
 
0.5%
nintendo 601
 
0.4%
la 566
 
0.4%
home 535
 
0.4%
Other values (18028) 120443
87.6%

Most occurring characters

ValueCountFrequency (%)
e 72034
 
8.2%
a 71810
 
8.1%
i 57886
 
6.6%
o 54973
 
6.2%
r 54834
 
6.2%
n 49993
 
5.7%
34765
 
3.9%
l 34331
 
3.9%
t 32871
 
3.7%
c 31132
 
3.5%
Other values (115) 388409
44.0%

Most occurring categories

ValueCountFrequency (%)
Lowercase Letter 629774
71.3%
Uppercase Letter 207288
 
23.5%
Space Separator 34771
 
3.9%
Other Punctuation 4105
 
0.5%
Decimal Number 4085
 
0.5%
Dash Punctuation 2143
 
0.2%
Math Symbol 309
 
< 0.1%
Connector Punctuation 250
 
< 0.1%
Modifier Symbol 152
 
< 0.1%
Final Punctuation 50
 
< 0.1%
Other values (6) 111
 
< 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 72034
11.4%
a 71810
11.4%
i 57886
9.2%
o 54973
 
8.7%
r 54834
 
8.7%
n 49993
 
7.9%
l 34331
 
5.5%
t 32871
 
5.2%
c 31132
 
4.9%
s 30331
 
4.8%
Other values (37) 139579
22.2%
Uppercase Letter
ValueCountFrequency (%)
G 17379
 
8.4%
S 15593
 
7.5%
A 14583
 
7.0%
C 13197
 
6.4%
M 12784
 
6.2%
E 12226
 
5.9%
T 12151
 
5.9%
P 10561
 
5.1%
L 9656
 
4.7%
O 9647
 
4.7%
Other values (25) 79511
38.4%
Other Punctuation
ValueCountFrequency (%)
. 1299
31.6%
' 971
23.7%
& 841
20.5%
, 610
14.9%
/ 151
 
3.7%
: 92
 
2.2%
% 64
 
1.6%
! 50
 
1.2%
* 14
 
0.3%
# 5
 
0.1%
Other values (3) 8
 
0.2%
Decimal Number
ValueCountFrequency (%)
0 1016
24.9%
1 697
17.1%
3 597
14.6%
2 513
12.6%
4 311
 
7.6%
5 265
 
6.5%
6 249
 
6.1%
8 184
 
4.5%
7 164
 
4.0%
9 89
 
2.2%
Math Symbol
ValueCountFrequency (%)
+ 307
99.4%
± 1
 
0.3%
> 1
 
0.3%
Other Symbol
ValueCountFrequency (%)
® 12
66.7%
° 4
 
22.2%
2
 
11.1%
Space Separator
ValueCountFrequency (%)
34765
> 99.9%
  6
 
< 0.1%
Dash Punctuation
ValueCountFrequency (%)
- 2142
> 99.9%
1
 
< 0.1%
Modifier Symbol
ValueCountFrequency (%)
´ 150
98.7%
` 2
 
1.3%
Open Punctuation
ValueCountFrequency (%)
( 43
97.7%
[ 1
 
2.3%
Connector Punctuation
ValueCountFrequency (%)
_ 250
100.0%
Final Punctuation
ValueCountFrequency (%)
50
100.0%
Close Punctuation
ValueCountFrequency (%)
) 43
100.0%
Other Letter
ValueCountFrequency (%)
º 4
100.0%
Other Number
ValueCountFrequency (%)
² 1
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%

Most occurring scripts

Value                   Count     Frequency (%)
Latin                   837066    94.8%
Common                  45972     5.2%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 72034
 
8.6%
a 71810
 
8.6%
i 57886
 
6.9%
o 54973
 
6.6%
r 54834
 
6.6%
n 49993
 
6.0%
l 34331
 
4.1%
t 32871
 
3.9%
c 31132
 
3.7%
s 30331
 
3.6%
Other values (73) 346871
41.4%
Common
ValueCountFrequency (%)
34765
75.6%
- 2142
 
4.7%
. 1299
 
2.8%
0 1016
 
2.2%
' 971
 
2.1%
& 841
 
1.8%
1 697
 
1.5%
, 610
 
1.3%
3 597
 
1.3%
2 513
 
1.1%
Other values (32) 2521
 
5.5%

Most occurring blocks

Value                   Count     Frequency (%)
ASCII                   868878    98.4%
None                    14107     1.6%
Punctuation             51        < 0.1%
Letterlike Symbols      2         < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
e 72034
 
8.3%
a 71810
 
8.3%
i 57886
 
6.7%
o 54973
 
6.3%
r 54834
 
6.3%
n 49993
 
5.8%
34765
 
4.0%
l 34331
 
4.0%
t 32871
 
3.8%
c 31132
 
3.6%
Other values (75) 374249
43.1%
None
ValueCountFrequency (%)
é 11248
79.7%
í 704
 
5.0%
ó 682
 
4.8%
ñ 315
 
2.2%
á 277
 
2.0%
ä 214
 
1.5%
ú 153
 
1.1%
´ 150
 
1.1%
Ñ 77
 
0.5%
ü 30
 
0.2%
Other values (27) 257
 
1.8%
Punctuation
ValueCountFrequency (%)
50
98.0%
1
 
2.0%
Letterlike Symbols
ValueCountFrequency (%)
2
100.0%

location
Text

MISSING 

Distinct: 6566
Distinct (%): 62.5%
Missing: 112183
Missing (%): 91.4%
Memory size: 958.7 KiB
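The two percentages above use different denominators, which is easy to miss. The short check below is a sketch that assumes Missing (%) is taken over all observations (122694, from the report overview) and Distinct (%) over the non-missing rows only. Under that assumption it reproduces the reported figures.

```python
n_observations = 122694  # "Number of observations" from the report overview
n_missing = 112183       # Missing, as reported for location
n_distinct = 6566        # Distinct, as reported for location

# Rows that actually carry a location value.
n_present = n_observations - n_missing

# Missing (%) is relative to all observations.
missing_pct = round(100 * n_missing / n_observations, 1)

# Distinct (%) is relative to the non-missing rows (assumption checked here).
distinct_pct = round(100 * n_distinct / n_present, 1)

print(n_present, missing_pct, distinct_pct)
```

The same reasoning is consistent with the Unique (%) figure reported further down for this variable (5786 of the non-missing rows).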

Length

Max length: 514
Median length: 414
Mean length: 338.5627438
Min length: 234

Characters and Unicode

Total characters: 3558633
Distinct characters: 95
Distinct categories: 13
Distinct scripts: 2
Distinct blocks: 3
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 5786
Unique (%): 55.0%

Sample

1st row: {'address_line': 'Cl. 181a #7-28, Bogotá, Colombia', 'zip_code': '', 'subneighborhood': None, 'neighborhood': {'id': 'TUNPQlNBTjkyNDYy', 'name': 'San Antonio Norte'}, 'city': {'id': 'TUNPQ1VTQTY3MTQ1', 'name': 'Usaquén'}, 'state': {'id': 'TUNPUEJPR1gxMDljZA', 'name': 'Bogotá D.C.'}, 'country': {'id': 'CO', 'name': 'Colombia'}, 'latitude': 4.7583128, 'longitude': -74.0261957}
2nd row: {'address_line': 'Calle 145a #13a-90, Bogotá, Colombia', 'zip_code': '', 'subneighborhood': None, 'neighborhood': {'id': 'TUNPQkNFRDEzNzQw', 'name': 'Cedritos'}, 'city': {'id': 'TUNPQ1VTQTY3MTQ1', 'name': 'Usaquén'}, 'state': {'id': 'TUNPUEJPR1gxMDljZA', 'name': 'Bogotá D.C.'}, 'country': {'id': 'CO', 'name': 'Colombia'}, 'latitude': 4.7268959, 'longitude': -74.0401792}
3rd row: {'address_line': 'Cl. 175 #15-20, Bogotá, Colombia', 'zip_code': '', 'subneighborhood': None, 'neighborhood': {'id': 'TUNPQkxBNzE0ODc', 'name': 'La Alameda'}, 'city': {'id': 'TUNPQ1VTQTY3MTQ1', 'name': 'Usaquén'}, 'state': {'id': 'TUNPUEJPR1gxMDljZA', 'name': 'Bogotá D.C.'}, 'country': {'id': 'CO', 'name': 'Colombia'}, 'latitude': 4.7553061, 'longitude': -74.0373278}
4th row: {'address_line': 'Transversal 5d, El Poblado, Medellín, Antioquia, Colombia', 'zip_code': '', 'subneighborhood': None, 'neighborhood': {'id': 'TUNPQkVMUDI3MDA0Ng', 'name': 'El Poblado'}, 'city': {'id': 'TUNPQ01FRGRjNjc4', 'name': 'Medellín'}, 'state': {'id': 'TUNPUEFOVGFiZWI3', 'name': 'Antioquia'}, 'country': {'id': 'CO', 'name': 'Colombia'}, 'latitude': 6.2058014, 'longitude': -75.5697919}
5th row: {'address_line': 'Sur', 'zip_code': '', 'subneighborhood': None, 'neighborhood': {}, 'city': {'id': 'TUNPQ0NBTDYyZDA0', 'name': 'Cali'}, 'state': {'id': 'TUNPUFZBTGExNmNjNg', 'name': 'Valle Del Cauca'}, 'country': {'id': 'CO', 'name': 'Colombia'}, 'latitude': 3.3704066, 'longitude': -76.5053344}
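The sample rows show that location is stored as the text form of a Python dictionary (single quotes, `None`), which is why it is profiled as Text and why a JSON parser would reject it. A minimal sketch of turning one such string into flat fields is given below. The function name `parse_location` and the choice of output fields are assumptions made for illustration, not part of the dataset or the report.

```python
import ast

def parse_location(raw):
    """Parse one stringified location record; return None if it cannot be read."""
    try:
        # literal_eval only evaluates Python literals, so it is safe on untrusted text.
        record = ast.literal_eval(raw)
    except (ValueError, SyntaxError, TypeError):
        return None
    if not isinstance(record, dict):
        return None

    def name_of(key):
        # Nested entries such as 'neighborhood' may be an empty dict or None.
        return (record.get(key) or {}).get("name")

    return {
        "address_line": record.get("address_line"),
        "neighborhood": name_of("neighborhood"),
        "city": name_of("city"),
        "state": name_of("state"),
        "country": name_of("country"),
        "latitude": record.get("latitude"),
        "longitude": record.get("longitude"),
    }

# The 5th sample row from the report, split across lines for readability.
raw = ("{'address_line': 'Sur', 'zip_code': '', 'subneighborhood': None, "
       "'neighborhood': {}, 'city': {'id': 'TUNPQ0NBTDYyZDA0', 'name': 'Cali'}, "
       "'state': {'id': 'TUNPUFZBTGExNmNjNg', 'name': 'Valle Del Cauca'}, "
       "'country': {'id': 'CO', 'name': 'Colombia'}, "
       "'latitude': 3.3704066, 'longitude': -76.5053344}")
location = parse_location(raw)
print(location["city"], location["latitude"], location["longitude"])
```

Returning `None` for unreadable input keeps the helper usable on a column that is 91.4% missing, where most cells are not strings at all.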
Value                   Count     Frequency (%)
name                    39420     11.7%
id                      39420     11.7%
(no visible value)      18884     5.6%
colombia                12736     3.8%
country                 10569     3.1%
city                    10512     3.1%
co                      10511     3.1%
state                   10511     3.1%
address_line            10511     3.1%
neighborhood            10511     3.1%
Other values (17446)    163607    48.5%

Most occurring characters

Value                   Count     Frequency (%)
'                       542986    15.3%
(space)                 327103    9.2%
:                       171632    4.8%
o                       157739    4.4%
e                       154582    4.3%
i                       148529    4.2%
a                       129429    3.6%
,                       127736    3.6%
n                       123396    3.5%
d                       122275    3.4%
Other values (85)       1553226   43.6%

Most occurring categories

Value                   Count     Frequency (%)
Lowercase Letter        1527120   42.9%
Other Punctuation       874091    24.6%
Uppercase Letter        469105    13.2%
Space Separator         327103    9.2%
Decimal Number          222630    6.3%
Open Punctuation        52644     1.5%
Close Punctuation       52644     1.5%
Connector Punctuation   21029     0.6%
Dash Punctuation        12126     0.3%
Math Symbol             133       < 0.1%
Other values (3)        8         < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
o 157739
 
10.3%
e 154582
 
10.1%
i 148529
 
9.7%
a 129429
 
8.5%
n 123396
 
8.1%
d 122275
 
8.0%
t 89032
 
5.8%
l 72362
 
4.7%
m 59752
 
3.9%
r 59366
 
3.9%
Other values (23) 410658
26.9%
Uppercase Letter
ValueCountFrequency (%)
N 59860
12.8%
U 49232
 
10.5%
T 48165
 
10.3%
C 37860
 
8.1%
P 37317
 
8.0%
Q 29500
 
6.3%
M 19243
 
4.1%
O 19051
 
4.1%
D 18873
 
4.0%
E 16819
 
3.6%
Other values (19) 133185
28.4%
Other Punctuation
ValueCountFrequency (%)
' 542986
62.1%
: 171632
 
19.6%
, 127736
 
14.6%
. 29346
 
3.4%
# 2312
 
0.3%
/ 52
 
< 0.1%
& 20
 
< 0.1%
· 3
 
< 0.1%
\ 2
 
< 0.1%
¿ 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
1 32775
14.7%
4 28179
12.7%
7 25015
11.2%
0 23839
10.7%
5 23660
10.6%
3 22665
10.2%
6 19875
8.9%
2 18668
8.4%
8 14066
6.3%
9 13888
6.2%
Open Punctuation
ValueCountFrequency (%)
{ 52555
99.8%
( 89
 
0.2%
Close Punctuation
ValueCountFrequency (%)
} 52555
99.8%
) 89
 
0.2%
Dash Punctuation
ValueCountFrequency (%)
- 12122
> 99.9%
4
 
< 0.1%
Space Separator
ValueCountFrequency (%)
327103
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 21029
100.0%
Math Symbol
ValueCountFrequency (%)
+ 133
100.0%
Other Symbol
ValueCountFrequency (%)
° 4
100.0%
Other Letter
ValueCountFrequency (%)
ª 3
100.0%
Currency Symbol
ValueCountFrequency (%)
$ 1
100.0%

Most occurring scripts

Value                   Count     Frequency (%)
Latin                   1996228   56.1%
Common                  1562405   43.9%

Most frequent character per script

Latin
ValueCountFrequency (%)
o 157739
 
7.9%
e 154582
 
7.7%
i 148529
 
7.4%
a 129429
 
6.5%
n 123396
 
6.2%
d 122275
 
6.1%
t 89032
 
4.5%
l 72362
 
3.6%
N 59860
 
3.0%
m 59752
 
3.0%
Other values (53) 879272
44.0%
Common
ValueCountFrequency (%)
' 542986
34.8%
327103
20.9%
: 171632
 
11.0%
, 127736
 
8.2%
{ 52555
 
3.4%
} 52555
 
3.4%
1 32775
 
2.1%
. 29346
 
1.9%
4 28179
 
1.8%
7 25015
 
1.6%
Other values (22) 172523
 
11.0%

Most occurring blocks

Value                   Count     Frequency (%)
ASCII                   3546508   99.7%
None                    12121     0.3%
Punctuation             4         < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
' 542986
 
15.3%
327103
 
9.2%
: 171632
 
4.8%
o 157739
 
4.4%
e 154582
 
4.4%
i 148529
 
4.2%
a 129429
 
3.6%
, 127736
 
3.6%
n 123396
 
3.5%
d 122275
 
3.4%
Other values (70) 1541101
43.5%
None
ValueCountFrequency (%)
á 6393
52.7%
í 2733
22.5%
é 1662
 
13.7%
ó 820
 
6.8%
ñ 339
 
2.8%
ú 120
 
1.0%
ü 26
 
0.2%
Á 13
 
0.1%
° 4
 
< 0.1%
Ñ 3
 
< 0.1%
Other values (4) 8
 
0.1%
Punctuation
ValueCountFrequency (%)
4
100.0%

seller_contact
Text

MISSING 

Distinct: 3490
Distinct (%): 33.2%
Missing: 112183
Missing (%): 91.4%
Memory size: 958.7 KiB

Length

Max length: 252
Median length: 241
Mean length: 133.2624869
Min length: 123

Characters and Unicode

Total characters: 1400722
Distinct characters: 92
Distinct categories: 10
Distinct scripts: 2
Distinct blocks: 2
The Unicode Standard assigns character properties to each code point, which can be used to analyse textual variables.

Unique

Unique: 3145
Unique (%): 29.9%

Sample

1st row: {'contact': '', 'other_info': '', 'webpage': '', 'area_code': '', 'phone': '', 'area_code2': '', 'phone2': '', 'email': ''}
2nd row: {'contact': '', 'other_info': '', 'webpage': '', 'area_code': '', 'phone': '', 'area_code2': '', 'phone2': '', 'email': ''}
3rd row: {'contact': '', 'other_info': '', 'webpage': '', 'area_code': '', 'phone': '', 'area_code2': '', 'phone2': '', 'email': ''}
4th row: {'contact': '', 'other_info': '', 'webpage': '', 'area_code': '', 'phone': '', 'area_code2': '', 'phone2': '', 'email': ''}
5th row: {'contact': 'Red Inmobiliaria', 'other_info': '', 'webpage': '', 'area_code': '', 'phone': '', 'area_code2': '', 'phone2': '', 'email': ''}
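Four of the five sample rows are dictionaries whose fields are all empty strings, so a non-missing seller_contact cell does not necessarily carry any contact information. The sketch below shows one way to keep only the populated fields of a record. The helper name `filled_contact_fields` is hypothetical, and treating `''` and `None` as "empty" is an assumption.

```python
import ast

def filled_contact_fields(raw):
    """Return only the contact fields that hold a non-empty value."""
    record = ast.literal_eval(raw)
    return {key: value for key, value in record.items() if value not in ("", None)}

# 1st and 5th sample rows from the report, split across lines for readability.
empty_row = ("{'contact': '', 'other_info': '', 'webpage': '', 'area_code': '', "
             "'phone': '', 'area_code2': '', 'phone2': '', 'email': ''}")
named_row = ("{'contact': 'Red Inmobiliaria', 'other_info': '', 'webpage': '', "
             "'area_code': '', 'phone': '', 'area_code2': '', 'phone2': '', 'email': ''}")

print(filled_contact_fields(empty_row))
print(filled_contact_fields(named_row))
```

Counting rows for which this helper returns an empty dictionary would give a more realistic missing rate for the column than the 91.4% reported above, which only counts absent cells.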
Value                   Count     Frequency (%)
(no visible value)      78018     44.1%
contact                 10513     5.9%
other_info              10511     5.9%
webpage                 10511     5.9%
area_code               10511     5.9%
phone                   10511     5.9%
area_code2              10511     5.9%
phone2                  10511     5.9%
email                   10511     5.9%
grupo                   972       0.5%
Other values (2197)     13948     7.9%

Most occurring characters

Value                   Count     Frequency (%)
'                       336351    24.0%
(space)                 166677    11.9%
e                       108749    7.8%
:                       84108     6.0%
o                       80476     5.7%
a                       79635     5.7%
,                       73621     5.3%
n                       45092     3.2%
c                       42961     3.1%
r                       36203     2.6%
Other values (82)       346849    24.8%

Most occurring categories

Value                   Count     Frequency (%)
Lowercase Letter        616868    44.0%
Other Punctuation       495213    35.4%
Space Separator         166677    11.9%
Uppercase Letter        46528     3.3%
Connector Punctuation   31533     2.3%
Decimal Number          22761     1.6%
Close Punctuation       10515     0.8%
Open Punctuation        10515     0.8%
Dash Punctuation        110       < 0.1%
Math Symbol             2         < 0.1%

Most frequent character per category

Lowercase Letter
ValueCountFrequency (%)
e 108749
17.6%
o 80476
13.0%
a 79635
12.9%
n 45092
7.3%
c 42961
 
7.0%
r 36203
 
5.9%
p 33733
 
5.5%
t 32531
 
5.3%
h 31711
 
5.1%
i 26727
 
4.3%
Other values (23) 99050
16.1%
Uppercase Letter
ValueCountFrequency (%)
A 6353
13.7%
I 4842
10.4%
R 4741
10.2%
O 3711
 
8.0%
E 3506
 
7.5%
S 2849
 
6.1%
N 2643
 
5.7%
L 2444
 
5.3%
G 2149
 
4.6%
M 1804
 
3.9%
Other values (21) 11486
24.7%
Other Punctuation
ValueCountFrequency (%)
' 336351
67.9%
: 84108
 
17.0%
, 73621
 
14.9%
. 995
 
0.2%
/ 57
 
< 0.1%
& 32
 
< 0.1%
" 30
 
< 0.1%
# 17
 
< 0.1%
% 1
 
< 0.1%
@ 1
 
< 0.1%
Decimal Number
ValueCountFrequency (%)
2 21214
93.2%
3 290
 
1.3%
4 238
 
1.0%
1 228
 
1.0%
0 175
 
0.8%
8 159
 
0.7%
5 138
 
0.6%
9 124
 
0.5%
6 109
 
0.5%
7 86
 
0.4%
Close Punctuation
ValueCountFrequency (%)
} 10511
> 99.9%
) 4
 
< 0.1%
Open Punctuation
ValueCountFrequency (%)
{ 10511
> 99.9%
( 4
 
< 0.1%
Space Separator
ValueCountFrequency (%)
166677
100.0%
Connector Punctuation
ValueCountFrequency (%)
_ 31533
100.0%
Dash Punctuation
ValueCountFrequency (%)
- 110
100.0%
Math Symbol
ValueCountFrequency (%)
+ 2
100.0%

Most occurring scripts

Value                   Count     Frequency (%)
Common                  737326    52.6%
Latin                   663396    47.4%

Most frequent character per script

Latin
ValueCountFrequency (%)
e 108749
16.4%
o 80476
12.1%
a 79635
12.0%
n 45092
 
6.8%
c 42961
 
6.5%
r 36203
 
5.5%
p 33733
 
5.1%
t 32531
 
4.9%
h 31711
 
4.8%
i 26727
 
4.0%
Other values (54) 145578
21.9%
Common
ValueCountFrequency (%)
' 336351
45.6%
166677
22.6%
: 84108
 
11.4%
, 73621
 
10.0%
_ 31533
 
4.3%
2 21214
 
2.9%
} 10511
 
1.4%
{ 10511
 
1.4%
. 995
 
0.1%
3 290
 
< 0.1%
Other values (18) 1515
 
0.2%

Most occurring blocks

Value                   Count     Frequency (%)
ASCII                   1400385   > 99.9%
None                    337       < 0.1%

Most frequent character per block

ASCII
ValueCountFrequency (%)
' 336351
24.0%
166677
11.9%
e 108749
 
7.8%
: 84108
 
6.0%
o 80476
 
5.7%
a 79635
 
5.7%
, 73621
 
5.3%
n 45092
 
3.2%
c 42961
 
3.1%
r 36203
 
2.6%
Other values (70) 346512
24.7%
None
ValueCountFrequency (%)
Ñ 75
22.3%
ñ 64
19.0%
á 46
13.6%
í 46
13.6%
ó 39
11.6%
é 24
 
7.1%
Ó 13
 
3.9%
Í 11
 
3.3%
Á 9
 
2.7%
É 5
 
1.5%
Other values (2) 5
 
1.5%